SEARCH AND RETRIEVAL INFORMATION SYSTEM AND METHOD

FIELD OF THE INVENTION

The present invention relates generally to information systems,
and more particularly to a broad-based information system for
searching and automatically retrieving information stored across
multiple platforms while parsing and filtering that information
according to a particular community of interest.

BACKGROUND OF THE INVENTION

Easy and efficient access to information has become essential to
maintaining an effective organization. Most information is stored
and accessed from discrete sources on internal networks where the
organization's financial and document systems are maintained.
However, because of increased use of the Internet and newsfeeds,
the demand for external information has also grown. To access
these different sources of information, a wide variety of search
and retrieval systems are used. However, many systems often fail
to deliver information which is relevant to the specific needs of
each user. Furthermore, many systems are limited by their
inability to traverse different platforms and operating systems to
search multiple sources of information and deliver the information
to a single location. These limitations are counterproductive to
the user's needs for an easy and efficient method of accessing
information across multiple platforms.

Many existing search and retrieval systems require the user to
specify a query statement on search criteria. Typically, such
systems enhance the user supplied query using word associations
similar to a thesaurus. However, because these word associations
are generic, these systems often do not focus the search to the
specific needs of the user. Consequently, for the search to yield
germane information, the user typically either sifts through the
information to determine its relevancy or performs several
iterations of the search, each time refining the search strings.
This process is time-consuming and inefficient.

Similarly, some systems allow users to personalize their search
and achieve a high degree of specificity. However, the user must
learn and use complex search syntax that is often difficult for
the user to understand, search iteratively, and carefully craft
the query in order to obtain specific results. For example, many
search systems require the user to input search strings using
Boolean operators. Hence for the search to be effective, the user
must be proficient with the usage of Boolean operators. Otherwise,
the search may not produce useful information and may be too time
consuming.

Another problem with many current search systems is that they
require individual user-initiated queries, instead of providing a
flow of highly relevant information on numerous topics. Separate
queries are usually required for separate topics, and these
queries need to be repeated by the user at appropriate time
intervals. This need to initiate a separate query for each topic
of interest further lengthens the process and exacerbates
inefficiency.

Furthermore, many existing search and retrieval systems are
limited to searching certain sources of information. This severely
confines the usefulness of these systems because users are often
required to perform the same searches on different systems to
access all potential sources of information, both internal to and
external to the user's system. The inefficiencies inherent in this
process are compounded in light of the inability of most systems
to retrieve information relevant to the needs of the user, and the
need for the user to initiate separate queries for each topic of
interest.

No system currently exists that retrieves a flow of information
from sources originating from multiple platforms and operating
systems while ensuring its relevancy to the user. While users
currently have access to many sources of information for managing
their operations, the sources of information are varied.
Organizations need access to a wider range of information and an
ability to tailor that information to the specific needs of the
user. An information system is needed wherein all potential
sources of information can be easily and automatically searched,
and only relevant information is retrieved and displayed to the
user.

The present invention provides a system for search and retrieval
of electronic objects, the objects including electronically
encoded information. The system is made up of at least a searching
subsystem, which includes one or more electronic lexicons in a
memory within the system, and a format filter subsystem coupled to
the searching subsystem. The electronic lexicon provides
predefined search query elements that are contextualized for
specific communities, to identify objects that are relevant to the
selections of specific individuals. The format filter subsystem
includes several format filter modules that operate to identify a
format of an electronic object and then select a format filter
module that will enable the system to search the object using the
search logic elements within the lexicon.

The present invention also provides for a method for search and
retrieval of electronic objects, including identifying a format of
an object to be searched, selecting a format filter module that is
configured to enable searching, and searching the object using
predefined search elements that are found in an electronic
lexicon. An aspect of another embodiment of the method is that
retrieved objects may be delivered to the user in a single viewing
format.

BRIEF DESCRIPTION OF THE DRAWINGS

The invention may be more completely understood in consideration
of the detailed description of various embodiments of the
invention which follows in connection with the accompanying
drawings, in which:

FIG. 1 is an overview block diagram of a search and retrieval
information system;

FIG. 2 is [illegible] of a personal computer;

FIG. 3 is a functional drawing of a lexicon using a search and
retrieval information system;

FIG. 4 shows an example of several lexicon entries in a search and
retrieval information system;

FIG. 5 is a more detailed block diagram of a search and retrieval
information system;

FIG. 6 is a flowchart of a search and retrieval information
method;

FIGS. 7 and 8 are flowcharts of a process by which users
personalize a target profile in a search and retrieval information
system and method;

FIG. 9 is a flowchart of steps taken by a query builder module in
a search and retrieval information system and method;

FIG. 10 is a flowchart showing steps taken by an indexing module
in a search and retrieval information system and method;

FIG. 11 shows steps taken by an administrator to develop a lexicon
in a search and retrieval information system and method;

FIG. 12 is a flowchart showing steps taken by a pattern
analysis module in a search and retrieval information system
and method.

While the present invention is amenable to various
modifications and alternative forms, specifics thereof have been
shown by way of example in the drawings and will be
described in detail. It should be understood, however, that
the intention is not to limit the invention to the particular
embodiments described. On the contrary, the invention is to
cover all modifications, equivalents, and alternatives falling
within the spirit and scope of the invention as defined by the
appended claims.



DETAILED DESCRIPTION OF THE VARIOUS EMBODIMENTS

The invention is believed to be applicable to a variety of
systems and arrangements which search and automatically
retrieve information. The invention has been found to be
particularly advantageous in application environments
where system users require access to information which
exists on different platforms and operating systems. While
the present invention is not so limited, an appreciation of
various aspects of the invention is best gained through a
discussion of various application examples operating in such
an environment.

Information and knowledge are essential raw materials
and assets among many of today's workers, executives, and
business owners. Current information management systems
are typically focused on only one type of information
repository. These management systems require that data be
categorized or formatted in a specific way at the time of
storage in order for that information to be available for
search and retrieval. Therefore, current information
management systems are of limited utility to most users, who
need to obtain useful information from multiple types of
sources.

Many executives and business owners are unable to
obtain the desired kind or quantity of relevant information
needed to serve business clients, help and improve their
businesses, and to manage the activities of employee teams.
To serve clients, many executives are in need of highly
specific business and industry news and information. To help
them run their own businesses, they need timely activity and
status reports from their employee teams, which requires the
ability to search on their own internal databases. A wide
variety of financial reporting information is also essential to
many executives' and business owners' success.

It is also important that information seekers are able to
easily access information using a personal computer.
Another complaint about current information management
systems is that they are difficult to connect to and difficult to
direct toward highly specific information needs. Many
information management systems require that a search query be
written in complex Boolean logic statements. For highly
specific information requirements, very complex Boolean
statements and repeated alternate search strings are often
currently required. Even where a search engine may
incorporate natural language capabilities, users may need
repeated alternative queries and iterative refinement of the
search query so that it is specific to a given industry, product
type, geographic location, or time period.

The present information search and retrieval system and
method is designed to address these shortcomings in current
information management systems. The present information
search and retrieval system and method will also be referred
to as an information appliance. The term information
appliance is intended to refer to the system as a whole. The
information appliance described can search diverse types of 
data and files because it is provided with format filter 
modules in order to be able to access information in various 
formats. For example, the information appliance may easily 
search documents that are internal to the user's system, or 
external commercial databases. Format filter modules may 
first identify an object's format, then use a specific filter 
module to read and search the object. The term "object," or 
"electronic object," will be used to refer to any type of 
electronic information that can be searched and accessed. 
Examples of electronic objects could be text documents such 
as newspaper articles, trade journal articles, report 
documents, or financial reporting information within an 
electronic database. In one embodiment the information 
appliance may also access, for example, Domino Notes® 
documents, relational database tables, object-oriented 
records, and other documents, records and databases. 
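
The two-stage behavior described above (first identify an object's
format, then hand the object to the matching filter module) can be
illustrated with a minimal Python sketch. This is only an
illustration under stated assumptions, not the implementation of
the information appliance: the names identify_format,
FORMAT_FILTERS, and search_object, the two example formats, and the
simple substring match are all hypothetical.

```python
import re
from typing import Callable, Dict, List


def plain_text_filter(raw: bytes) -> str:
    """Hypothetical filter: treat the object as UTF-8 text."""
    return raw.decode("utf-8", errors="replace")


def html_filter(raw: bytes) -> str:
    """Hypothetical filter: strip markup tags so only the text is searched."""
    text = raw.decode("utf-8", errors="replace")
    return re.sub(r"<[^>]+>", " ", text)


# Registry of format filter modules, keyed by format name (assumed names).
FORMAT_FILTERS: Dict[str, Callable[[bytes], str]] = {
    "text": plain_text_filter,
    "html": html_filter,
}


def identify_format(raw: bytes) -> str:
    """Step 1: identify the object's format from its leading bytes."""
    head = raw[:256].lstrip().lower()
    if head.startswith(b"<!doctype html") or head.startswith(b"<html"):
        return "html"
    return "text"


def search_object(raw: bytes, terms: List[str]) -> bool:
    """Step 2: select the matching filter, then search the filtered text."""
    text = FORMAT_FILTERS[identify_format(raw)](raw).lower()
    return any(term.lower() in text for term in terms)
```

Keeping the filters in a registry means a new source format only
requires registering one more filter function; the search logic
itself stays unchanged.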

In order to help a user obtain highly relevant information, 
without knowledge of complex Boolean search string 
construction, the information appliance provides predefined 
search elements designed to identify electronic objects that 
will be most useful to the user's community. A list of topics 
for each community allows the user to easily describe the 
desired range of the search. Each topic may be associated 
with the predefined search criteria that is highly specific to 
the user's industry or community. An example of a topic is 
"general budgeting techniques." Some topics are more 
broadly indicative of an industry context, such as a geo- 
graphical context like "South America" or a time frame like 
"Spring, 1996." Some topics are more like subtopics, such 
as the subtopic "pricing strategy" being more specific than 
the topic "marketing plan." 

Each topic typically is linked to a predefined search query 
that is designed to gather information on the topic relevant 
to the user. The search query may contain words frequently 
found within a discussion relating to the topic and may look 
for those words in the object being searched. When the user 
specifies two topics, the information appliance may link the 
two search queries associated with the two topics and 
thereby execute a highly specific search. Throughout this 
description, the term "topic" will be used to refer to not only 
a topic but a subtopic and context association, where each is 
linked to a stored search element. A topic could also be a 
type of document, such as receivables reports or status 
reports. If the topic is a type of report, the search query may 
look for certain types of document names, or indicative 
words within the document. A topic may also be a specific 
document, storage location, or address, such as a web site 
address, where relevant information resides. In this case, the 
search element may be just the document name and location, 
so that the document is retrieved when the search is
executed. A user creates a new topic when the user combines 
a topic with a context association. A user may also create a 
new topic by defining a new search query entirely. The 
search query associated with each topic is herein termed a 
search atom or search element. When several search atoms 
are linked, the resulting highly specific search string is
termed a search molecule. 
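
The relationship between topics, search atoms, and search molecules
can be sketched in Python as follows. This is a minimal sketch under
stated assumptions: the Lexicon class, the word lists for each
topic, and the choice to join words with OR and atoms with AND are
all hypothetical, since the description says only that atoms are
linked and does not fix a query syntax.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Lexicon:
    """Hypothetical community lexicon: each topic maps to the words of its search atom."""
    community: str
    atoms: Dict[str, List[str]] = field(default_factory=dict)

    def atom(self, topic: str) -> str:
        """Return the predefined search atom linked to one topic."""
        words = self.atoms[topic]
        return "(" + " OR ".join(f'"{w}"' for w in words) + ")"

    def molecule(self, topics: List[str]) -> str:
        """Link the atoms of several topics into one search molecule (AND is an assumption)."""
        return " AND ".join(self.atom(t) for t in topics)


# Illustrative entries only; the word lists are invented for this sketch.
enterprise = Lexicon(
    community="Enterprise",
    atoms={
        "capital": ["capital expenditure", "fixed assets"],
        "South America": ["Brazil", "Argentina", "Chile"],
    },
)

query = enterprise.molecule(["capital", "South America"])
```

Here a user who selects the topic "capital" and the context "South
America" never writes a Boolean expression; the stored atoms supply
it.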

FIGS. 3 and 4 illustrate a lexicon. FIG. 3 shows a lexicon 
32 as being made up of a library of topics 36 and a set of 
search elements 34. Within the present system, a lexicon is 
typically a storage framework of search elements or search 
atoms, each linked to one or more topics in the library of 
topics. When a topic is chosen by the user to be searched, the 
search query that is linked to the topic is used to carry out 
the search. When a topic appears to the user in the interface 
for viewing selection, the topic may be linked to a single
search atom, or the topic may be linked to a compounded
query or search molecule which incorporates more than one
independent search query. The type of links associated with
each topic depends on the topic.

FIG. 4 shows an example of the topics being linked to the
search elements in a different way of showing lexicon 32.
The community lexicon shown in FIG. 4 is for a community
termed Enterprise. Some topics within the Enterprise
community lexicon include best practices: budgeting, capital,
facilities, and operating. It is also possible to have more
specific subtopics within each topic. The right-hand side of
the chart of FIG. 4 shows search queries that are linked to
each topic. For example, the search query linked to the topic
"capital" is very lengthy and lists examples of capital that
might be found in a discussion of capital in the Enterprise
context. In one embodiment of the present information
appliance, the user will be able to designate the topic
"capital" within the community "Enterprise" and have the
benefit of the complex preprogrammed search criteria.

FIG. 1 illustrates one particular embodiment of an
information system for searching and retrieving electronic
objects across multiple platforms and operating systems.
Referring to FIG. 1, the system 20 shown includes a
searching subsystem 22 which is capable of accessing data 24
stored across multiple platforms and operating systems.
Some data accessed may be internal to the system, while
some data may be accessible through a remote communication
[illegible]. The system accesses data resources [illegible]
interfaces with a format filter subsystem 26 to search data of
varying formats. Electronic lexicon 34 is used by the searching
subsystem to search the data and identify electronic objects 30
which are specifically relevant to the information needs of the
user. Thus, the lexicon typically stores search elements that
reflect the user's community of interests and returns only those
[illegible].

The information appliance may be used with many different
[illegible] is provided with a user terminal, such as a personal
computer, which may be linked to a modem, communication
lines, network lines, a central processor, and databases. An
[illegible] provides the user with a way to input the user's
preferences to the information appliance and [illegible]. The
preferred embodiment of the information appliance may be
practiced with the user terminal 15 of FIG. 1 being a personal
computer such as an IBM®, Compaq®, Dell®, or Apple®
Macintosh® personal computer. As previously indicated,
user terminal 15 may preferably be part of a client/server
system. A representative hardware environment of the user
terminal is shown in FIG. 2. The preferred hardware
configuration includes a central processing unit 17, such as a
microprocessor, and a number of other units interconnected
via a system bus 16. [illegible] of a terminal 15 may also be
spread out over one or more interconnected computers or
computer systems.

The user terminal shown in FIG. 2 also includes a
[illegible] adapter 21 for connecting peripheral devices such as
disk storage units 23 to the bus 16. A user interface adapter 25
for connecting input devices [illegible] is also included.
Examples of possible input devices connected to the adapter 25
include a keyboard 35, a mouse 29, a speaker 28, a microphone 33,
and/or other user interface devices such as a touch screen or
voice interface (not shown). A communication adapter 37 is
included for connecting the user terminal to a communication
network link 39. A graphical user interface 41 is also connected
to the system bus 16 and provides the connection to a display
device 43. It will be apparent to those in the art that the
mouse 29 may be a typical mouse as known in the industry,
a trackball, light pen, or the like.

The user terminal typically has resident thereon an
operating system such as Windows®, Windows NT®, Apple
System 7®, IBM OS/2®, or UNIX® software. The network
also has a resident operating system, for example Novell®
Netware or Novell® Intranetware, among other possibilities.
In the preferred environment, the desktop typically has
Internet browser software, such as MS Internet Explorer or
Netscape Navigator. In the alternative, the network software
operating system may not be available separate from the
work station operating system, and the network operating
system may have an integrated Internet browser. Other
alternatives for client and server software include Oracle®
or Microsoft SQL Server.

A networked personal computer environment, a client/server
system, a mainframe terminal environment, WEB TV
terminal environment, dumb terminal environments, a
networked computer environment that is connected to an
Internet site, or a personal computer alone could be used to
implement the information appliance. Any type of system
that allows the user to receive target objects and documents
and [illegible] input device to set up a user profile could be
used with the system. Depending upon the user's needs, a
client/server system may be the most preferable computer
system for implementing the information appliance.

FIG. 5 shows a more detailed diagram of several
subsystems of an information system. A user terminal 15 is used
to input user information into the searching subsystem 22. A
community module 31 allows users to [illegible] executives,
accountants, higher education counselors, members of a
corporation or members of a department within a
corporation. Each of these groups of people typically uses
a specialized professional or organizational vocabulary. A
lexicon as used in this application refers to a location
in the system memory where search elements expressing the
special vocabulary for each group are stored [illegible]
sophisticated, very targeted searching. Each lexicon
typically stores a bank of complex search query specifications
34 using the special vocabulary, or semantic context, of
the group. Furthermore, within the lexicon is a
library 36 of topics, subtopics, context associations and
document types that are of interest to the community, each
linked to one or more of the stored search queries.

A profile module 38 may be configured to allow users to
choose topics or subtopics from the library of topics
which are relevant to the user's current search needs. Similarly,
the user may specify or link additional context associations
such as specific geographic locations, industries, or company
names. The term topic could encompass topics, subtopics or
context associations that are listed in the library of topics 36.
A topic could also refer to a type of document, such as a
receivables report or a status report, or to a storage location
such as a web site address. In addition, a user may create a new
topic by combining a topic with a context, for
example. For each user, the profile module is typically
configured to write this information to a target profile 40
which is accessed by a query builder module 46 discussed
below, such that the target profile 40 stores the user's list of
topics for which automatic searching is desired.

In conjunction with the user's target profile and topics
specified therein, the present system is typically configured so
that the user can specify sources of data to be searched by
accessing an atlas module 42 that creates the user atlas 44.
The user atlas specifies the content sources, systems, and
locations to be searched by the information system on a
regular basis. The user atlas 44 may list internal database
locations, external database locations or both.

A query builder module 46 is typically employed to read
the target profile. For each entry in the target profile the
query builder module refers to the community lexicon to
build a master search query 48 which specifies the
information the system must retrieve. By way of example, the
master search query 48 may be several search queries associated
with several topics linked together or the master search
query 48 may be just one search query from the lexicon. A
typical process of how the query builder module creates the
master search query is described in more detail below. A
master search module 50 may be employed to read the
master search query and to create autonomous search agents
52 which perform the searches using the criteria in the
master search query and at the locations specified in the user
atlas. The autonomous search agents use an appropriate
format filter for the document locations. Each autonomous
search agent searches the data 24 and returns electronic
objects that satisfy the search criteria in the master search
query. The data 24 may be located internally to the computer
system or may be found on external databases that are
accessed via communication lines. The electronic objects
typically are returned to the master search agent for
subsequent indexing and delivery to users.

An indexing subsystem 54 may be employed to create an
index of all the terms in an electronic object. Using the
index, queries can be run quickly for indexed documents.
There are several different ways an indexing subsystem
module 54 could be used. An indexing module could create
an index of all terms in all documents residing locally on the
system, enabling these locally resident documents to be
searched very quickly. In the alternative, indexing can be
configured to be performed only when the local documents
are identified by the searching subsystem as being relevant.
The indexing module may also preferably create a usable
index of all terms in each electronic object 56 returned from
external sources by the master search module.

Documents that have been indexed are available for easy
future use within the community. When an index of a
document exists, autonomous search agents 52 typically are
not utilized. Rather, the master search query 48 typically
operates directly on the index.

A typical indexing module accesses the format filter
subsystem 26 and selects the appropriate format filter module
[illegible] electronic object to an index
table 58. The index table may contain an abstract of each
electronic object and an electronic pointer to where the
[illegible] documents will be carried out much more quickly than
searches of the entire document.

In one preferred embodiment of the system, autonomous
search agents are used to search external electronic
databases, while the indexing module is used to search local
electronic objects and previously retrieved objects from
external sources. However, the autonomous agents may be
used for all local queries.

When the user chooses an electronic object to view, a
retrieval module 60 within a retrieval subsystem 59 is
typically employed to access the index table and to select an
appropriate format filter module 27 in order to return and
display the electronic object to the user in the appropriate
display format.

A pattern analysis subsystem 61 may be configured to
contain a pattern analysis module 62 that statistically
analyzes the information in the index table. Such a pattern
analysis module typically produces raw statistics about the
placement and frequency of term occurrence. Additional
processing may optionally be performed on these statistics,
either by the pattern analysis module 62 or by another
module designed to use these statistics. For example, in a
preferred embodiment of the system, the pattern analysis
module 62 will parse the electronic objects on the index
table 58 and statistically analyze the appearance of the
community lexicon terminology within each electronic
object. These object statistics 64 can be used to enhance the
community or individual lexicon based on the frequency or
infrequency of terms appearing in electronic objects
satisfying the search criteria. The object statistics may also be
compiled across electronic objects for analysis of subject
data, for example, to review patterns of activity in the data,
such as merger and acquisition data for specific companies
or industries over periods of time. The activities of a typical
pattern analysis module 62 will be discussed in greater detail
in relation to FIG. 12.

FIG. 3 shows a more detailed diagram of a typical
association between topics 36 and the set of search elements
68 within the community lexicon 34. Within a typical topic,
there exists one or more search logic atoms representing a
complex search query specification developed within the
context of the terminology and concerns of the community.
When a user creates a topic by specifying the
combination of two topics, such as specifying a topic and a
context association, the query builder module may be
configured to concatenate the search atoms for each of the
specified topics into a compound search molecule. As an
alternative, only one topic might be specified, associated
with one search atom. In a typical system, the query builder
module passes either the search atom or the compound
search molecule onto the master search module 50, in
the form of a master search query 48. As described above,
[illegible] agent for each search atom or
molecule, or the master search query is used
[illegible] index of documents. The master search
[illegible] information about location
[illegible] to the autonomous search agents.

FIG. 4 shows how one [illegible]
association between topics, subtopics, context
associations, and the corresponding search elements. The
example lexicon 32 shown in FIG. 4 is for the Enterprise
community, which is a community that may be [illegible] for
[illegible].

[illegible] one embodiment of the present system is the ability
of the system to concatenate search query elements into
sophisticated, highly specific queries, in order to limit a
topic to a specific context, for example. A certain topic, such
as a subject or report type, might be limited to a context
such as time, place, specific companies or industries. For
each topic, subtopic, context association or report type, the
system maintains a search specification that describes the
topic in the syntax of the search agents and index module.
The user can also identify other topics which may be of
value. The search atoms may be concatenated into a longer,
more complex search query. Thus, the search query is a
highly refined search tool for the selected topics, thereby
allowing the system to effectively search and retrieve
information that is specific to the needs of the user.

In one mode of operation of the preferred embodiment,
the information system can automatically search and retrieve
electronic objects relevant to a user or community, and
provide a stream of useful information to the user. FIG. 6
outlines how this process is performed. The left half of FIG.
6 shows the general steps that a developer or administrator
of the present system may take to prepare the system for use
by a specific community. Referring now to FIG. 6, format
filter modules for each document source format are shown to
be selected or created in step 80 to allow the system to search
electronic objects of various formats. A community atlas is
also shown as being created and maintained in step 82 to
specify the systems and information resources available to
the user or community. Also, the community lexicon 32 may
be created and refined at step 84 so that it contains
appropriate predefined queries for each topic, subtopic, context
association, and report type. Similarly, a default target
profile may be created 86 for a hypothetical average user in
the community. In this step, the developer may choose topics
from the library of topics 36 that are of broad interest to
users to be stored in the default target profile. The default
target profile typically will be provided to users when they
first access the information appliance of the present
invention, and can be used until each user creates a
personalized target profile. Individual default target profiles
may be provided as well.

Once these preliminary steps are performed, the query
builder module can combine 88 the search atoms for each
default community and individual target profile into complex
search queries. Now the system can perform searching
steps and stream objects to each user automatically. For each
user or community target profile, the master search module
may create an autonomous search agent at step 90 to search
in the locations specified in the atlas. The autonomous
search agent may be configured to select the appropriate
format filter module to search documents of varying formats
and return the electronic objects at step 92 which satisfy the
criteria of the search molecule. In such a configuration, the
system then typically checks to determine whether the
electronic object has already been indexed for another user
or community at step 94. If the electronic object has already
been indexed, the index module may then create an abstract
of the electronic object and store the electronic references
for the object on the index table at step 98. If an abstract has
already been created and rated, the system may proceed to
step 102. If the electronic object has not already been
indexed by the system, the appropriate format filter module
may be accessed in step 96 by the index module to create an
index of the electronic object retrieved by the autonomous
search agent. At step 98, an abstract typically is created,
evaluated and rated 100 for its effectiveness in fulfilling the
target profile specifications for each search element for each
user. The electronic objects that fulfill the target profile can
be organized in a variety of views, for example, according
to topic, subtopic, context association, or report type, and
streamed to the user on demand at step 102.
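
The index-once behavior of steps 94 to 100 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the names `process_object` and `index_table`, the 25-word abstract, and the rating formula (the share of profile terms found in the object) are all hypothetical choices made for the example.

```python
def process_object(reference, text, profile_terms, index_table):
    """Index an object at most once, then rate it against one target profile.

    Hypothetical helper: `index_table` maps an object reference to its stored
    word set and abstract, so a second user's search reuses the first index.
    """
    entry = index_table.get(reference)            # step 94: already indexed?
    if entry is None:
        words = text.split()
        entry = {
            "words": {w.lower().strip(".,;:") for w in words},  # step 96: index
            "abstract": " ".join(words[:25]),                   # step 98: abstract
        }
        index_table[reference] = entry
    # Step 100 (assumed metric): share of profile terms present in the object.
    hits = sum(1 for term in profile_terms if term.lower() in entry["words"])
    rating = hits / len(profile_terms) if profile_terms else 0.0
    return entry["abstract"], rating
```

Because the index entry is keyed by the object reference, a second call for the same object with a different profile only repeats the rating step.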

FIG. 7 shows how a user typically retrieves the electronic 
objects found by the system and personalizes the target
profile. When accessing the information system for the first 
time 110, the user may be presented with a list of high-level
topic areas at step 112 from the library of topics 36 that is 
designed for the community. Pointers and metadata 117 can 
be configured to store crucial information about content 
sources such as source location, source type, and other 
specifications. The user can select topic areas of interest for 
which electronic objects may be retrieved at step 114. The 
user can then select at step 116 the electronic objects of 
interest. The system typically delivers the first few sentences 
of each electronic object to the user's viewing frame. The 
user then selects an electronic object to read at step 118. That 
target object may then be streamed in its entirety to the user 
in step 120 so that the user can view the complete document. 
If the user would like to view another document at step 122, 
she may simply select another object from the list of 
electronic objects. When the user is finished viewing the 
found objects, another topic can be selected from the list of 
topic areas 124. At step 126, the user can personalize the 
target profile or exit the system at step 128. As discussed in 
greater detail in relation to FIG. 12, the system may be 
configured to keep track of which objects the user and all 
users choose to view in order to obtain statistical informa- 
tion about the popularity of objects. The continued steps of 
the user while personalizing the target profile and retrieving 
electronic objects are shown in FIG. 8. 
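
The view tracking mentioned above can be pictured with a small counter structure. This is only a sketch: the class name `ViewLog` and its methods are hypothetical, and the source defers the actual statistics to FIG. 12.

```python
from collections import Counter


class ViewLog:
    """Counts object views per user and across all users (illustrative only)."""

    def __init__(self):
        self.per_user = {}          # user id -> Counter of object references
        self.overall = Counter()    # object reference -> total views

    def record_view(self, user, reference):
        """Register that `user` chose to view the object `reference`."""
        self.per_user.setdefault(user, Counter())[reference] += 1
        self.overall[reference] += 1

    def most_popular(self, n=3):
        """Return the n most-viewed objects as (reference, views) pairs."""
        return self.overall.most_common(n)
```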

FIG. 8 shows a process by which a user can personalize, 
modify, or refine the target profile. The library of topics may 
be displayed 130 for each community lexicon associated 
with the user. The library of topics is stored in the commu- 
nity lexicon 32. If the user has already created personalized 
topics, as discussed further below, the user-defined topics 
may also be displayed at this time. After selecting the topic 
area of interest 132, the user typically is presented with a 
detailed list of subtopics 134 and context associations 136, 
where applicable and when available according to the struc- 
ture of the subject community. Lists of other types may be 
shown to the user at this point also. Report types or other 
types of topics may also be presented. Alternatively, the 
subtopics, context associations, report types, and all other 
topics may be displayed to the user simultaneously. From the 
lists, the user can select 138 the subtopics or context 
associations of interest and select additional links at 140. 
The user can further personalize the search string, if desired 
at step 142, by creating a free text search string 144. A 
personalized search string may be stored in a user lexicon 
145 along with other user-designated values. At this point, 
the user can include 146 another topic area. 

The user can also modify 148 the user atlas 44. The user 
atlas stores locations of databases where the information 
appliance will search. This feature allows the user to specify 
information sources where the search will be most produc- 
tive and results in a more efficient search by reducing the 
scope. If the user chooses to change the user atlas 44, the 
current atlas settings 150 typically are streamed to the user. 
Alternatively, for example, the current atlas options may be 
stored 152 in the target profile. From the list of atlas options, 
the user can select 154 the databases the system will search. 
After modifying the atlas, the changes made are stored in the 
user target profile 152. 

After the system performs a search pursuant to the criteria 
stored in the user target profile 152, the user can view 156 
the electronic objects returned by the system. After viewing 
the document, the user can remain in the system, returning 
to the list of topics at step 112 in FIG. 7, to continue to 
search, or exit the system 128. In the alternative, after setting 
up a search target profile, the electronic objects found in the 
search could be streamed to the user's E-mail address on a 
periodic schedule. Returned target documents could also be 
returned to the user's local hard drive or another storage 
place on the user's network. This delivery route may be used 
to allow for perusal while disconnected from remote
sources, or to allow the pattern analysis module to operate 
on the stored retrieved documents. 

FIG. 9 shows a more detailed flowchart of the query
builder process by which the preferred system may carry out
the user-defined search. At scheduled intervals, the query
builder module concatenates search atoms associated with
the topics in the target profile into search molecules. The
process begins at step 180. The system will then select 182
the first or next user or community target profile 40. The
query builder module will read 184 the next topic from the
target profile and identify the appropriate lexicon. If a user
has already personalized a target profile, then the system will
be accessing the user target profile at this time. However, if
the user has not yet created a personalized target profile, the
default community target profile will be accessed.

In the embodiment shown in FIG. 9, the search query for
that particular topic is read 186 and placed in temporary
storage. The query builder module will then determine if
there exists a context association 188 or other relevant topics
or subtopics to be combined for the specified topic or
subtopic. If no context association or other topic or subtopic
exists, the search molecule 194 is complete. Otherwise, the
query builder module will read and concatenate 190 the
search atoms for the context association or other relevant
topics or subtopics to those already in temporary storage.
This concatenation process will continue 192 until no further
context associations or other relevant topics or subtopics
exist. Once the concatenation process is completed, the
concatenated search molecule is available for use by the
master search module 50, shown in FIG. 5. If there is another
unprocessed topic 196 in the current target profile, the
process starting at step 184 is repeated. Similarly, the query
builder module checks for unprocessed users or community
target profiles 198. The system returns to step 182 to process
additional users or communities. If these do not exist, the
query builder process 199 will end until the next scheduled
iteration or user-initiated search.
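
The loop of FIG. 9 can be sketched in a few lines. The sketch rests on assumptions the source does not state: the lexicon is modeled as a plain dictionary from topic name to search atom, the names `Topic`, `build_search_molecule`, and `build_all` are hypothetical, and the atoms are joined with a parenthesized `AND`, whereas the text says only that they are concatenated.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    """One target-profile entry: a topic plus any linked context associations."""
    name: str
    context_associations: list[str] = field(default_factory=list)


def build_search_molecule(topic: Topic, lexicon: dict[str, str]) -> str:
    """Concatenate the topic's search atom with those of its associations."""
    atoms = [lexicon[topic.name]]                  # step 186: read the topic's query
    for assoc in topic.context_associations:       # steps 188-192: until none remain
        atoms.append(lexicon[assoc])               # step 190: concatenate
    # Step 194: molecule complete. Joining with AND is an assumed convention.
    return " AND ".join(f"({atom})" for atom in atoms)


def build_all(profiles: dict[str, list[Topic]], lexicon: dict[str, str]) -> dict[str, list[str]]:
    """Steps 182-198: every topic of every user or community target profile."""
    return {
        owner: [build_search_molecule(topic, lexicon) for topic in topics]
        for owner, topics in profiles.items()
    }
```

A topic without context associations yields a molecule consisting of its single atom, matching the case in which the molecule is complete as soon as the topic's query has been read.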

FIG. 10 shows additional detail of the indexing process.
The indexing module is shown to be started 200 either when
a query is made against the indexing module or when an
electronic object is returned by an autonomous search agent
to the master search module for storage. As discussed above
with reference to FIG. 5, there are many different possible
ways for the indexing module to operate on internal
documents, external documents or both. Assuming that all
internal electronic objects will be indexed, the indexing
module reads the storage references 202 written by the
master search agent to see which servers, directories, and
databases have material for indexing. The indexing module
then determines 204 whether the format or structure of the
electronic object is that of a file or database.

In the process shown in FIG. 10, if the structure is a file,
the indexing module reads the file extension, header, and
initial bytes of the file 206 to determine the file format. Thus,
the appropriate format filter module can be selected. The
indexing module then determines whether the object has
been written to the indexing table 208 since the previous
indexing module run. If it has not been updated since an
index was last created or the object has never been indexed,
the next file is read and its format is determined for indexing.
Otherwise, the indexing module will access 210 the appropriate
filter module so that it can read the document and
create a new full-text index and abstract for the updated
electronic object.

If the format or structure of the electronic object is that of
a database, the indexing module typically negotiates 212 the
database's security and accesses the database. Next, the
indexing module will select 214 the appropriate format filter
module to format the electronic object for indexing. Finally,
the indexing module will check to ensure that the electronic
object has not already been written to the indexing table 216.
After the indexing module has read the object and created an
index and abstract of the electronic object 210, the indexing
module stores 218 a full-text index, abstract, and location
reference for the electronic object.

The indexing module typically ascertains whether the 
electronic object is a file or database server at step 220. If 
more files in a directory structure 222 or more objects in the 
database 224 still remain, the process is repeated starting at 
step 206 or step 214 respectively. Furthermore, if another 
database exists 226, the process of negotiating the security 
and accessing that database is continued starting at step 212. 
Finally, the indexing module will determine whether or not 
another server needs to be indexed 228. If so, the process of 
reading the storage references in the system will be repeated 
starting at step 202. If no more servers need to be indexed, 
the indexing module is terminated 230 until the next itera- 
tive cycle. 
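
The format determination at step 206 (file extension, header, and initial bytes) might look like the sketch below. The function name `detect_format` and both lookup tables are hypothetical. The two byte signatures shown (`%PDF-` for PDF and `{\rtf` for RTF) are the well-known leading bytes of those formats, and a real table would cover many more.

```python
from pathlib import Path

# Assumed signature table, checked against the first bytes of the file.
MAGIC_PREFIXES = {
    b"%PDF-": "pdf",
    b"{\\rtf": "rtf",
}

# Assumed fallback table, used when no signature matches.
EXTENSION_FORMATS = {
    ".txt": "text",
    ".htm": "html",
    ".html": "html",
    ".pdf": "pdf",
    ".rtf": "rtf",
}


def detect_format(filename: str, initial_bytes: bytes) -> str:
    """Return a format label from the initial bytes, else from the extension."""
    head = initial_bytes.lstrip()                 # tolerate leading whitespace
    for prefix, fmt in MAGIC_PREFIXES.items():
        if head.startswith(prefix):
            return fmt
    return EXTENSION_FORMATS.get(Path(filename).suffix.lower(), "unknown")
```

The returned label would then be used to pick the matching format filter module. Checking the bytes before the extension means a mislabeled file is still routed by its actual content.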

FIG. 11 shows how a lexicon may be developed for a 
specific community. First, a specific audience typically is 
identified 252 based upon the business rules or other frame 
of reference common to the audience. Members of the target 
audience can be interviewed 254 to determine a) the types
and sources of new topics of interest to the community; b) 
the types and sources of learning and business performance 
improvement subjects of interest to the community; c) the 
types and sources of technical subjects of interest to the 
community; d) the types and sources of financial and business
management systems reports of interest to the targeted 
community; and e) the types and sources of other business 
documents and on-line discussion topics or subjects of 
interest to the community. Based on these interviews, the 
topic areas of highest priority 256 typically are identified. 
These interviews will also identify important data locations 
that may be made available for selection in the atlas. 
Lexicon development may proceed by reviewing 258 the 
vocabulary that applies to this audience, for example by 
referring to professional dictionaries or articles and by 
drafting 260 a high-level framework of the topic areas for 
the community. Where required, this high-level framework 
is further broken down into subcategories 262 for the topic 
area. The lexicon developed to this point should be tested 
264 using focus groups to ensure that the terminology is 
within the framework of the topic concepts used by the 
community. 

In developing the lexicon, another important aspect of the 
preferred embodiment is to ensure that topic areas are 
separated 266 into stand-alone lists where possible, such as 
industries, geographic locations, and company names. This 
serves to minimize the hierarchical relationships and maxi- 
mize the many-to-many relationships for the query builder 
to concatenate. Predefined search queries 268 should be 
created for each topic, subtopic, context association, or 
report type utilizing the linguistic context of the community 
and the desired information resources that will be part of the 
system. These predefined search atoms should be tested 270 
against appropriate content and refined accordingly. Similar 
to the predefined search topics and subtopics, each element 
of each free-standing context list typically is defined in the 
search elements within the community lexicon, and wherein
each topic identifies a subject or concept of interest that is 
relevant to the information needs of the community. 

22. The method of claim 21, the method further comprising
creating a target profile by selecting at least one topic
from the library of topics.

23. The method of claim 19, the method further compris- 
ing creating a user atlas by selecting at least one preferred 
data resource from a list of data resources from which 
objects may be retrieved.

24. The method of claim 19, the method further compris- 
ing: 

creating a target profile by selecting at least one topic
from a library of topics, wherein each topic is associated
with one or more of the predefined search elements
and each topic identifies a subject that is relevant to the
information needs of a community of users; and

creating an electronic master search query by concatenat- 
ing the search elements associated with each topic 
listed in the target profile.

25. The method of claim 24, wherein the electronic master 
search query is used to search the object. 
26. The method of claim 19, wherein the step of searching
the object is scheduled to occur automatically at specified 
time intervals. 

27. The method of claim 19, further comprising creating 
an index of an object that was identified in the searching 
step, by compiling and storing in computer readable medium 
summary information that identifies the object. 

28. The method of claim 19, further comprising sifting 
through the objects identified in the searching step to rec- 
ognize and count words within each object that are in the 
lexicon. 

29. The method of claim 19, further comprising locating 
terms within the identified objects according to frequency 
and location of the terms in relation to words in each object 
that are in the lexicon. 

30. The method of claim 19, further comprising: 
retrieving objects that are identified in the searching step; 
recording a number of times that each object has been
identified; and

reporting the number of times that each object has been 
identified. 

***** 
INFORMATION SEARCH APPARATUS AND METHOD

BACKGROUND OF THE INVENTION

The present invention relates to an information search
apparatus and method for searching information on the basis
of an input query word. More specifically, the present
invention relates to an information search apparatus and
method for managing a plurality of kinds of multimedia
information, and searching the managed multimedia
information for desired multimedia information, and a computer
readable memory.

A conventional information search apparatus, which
searches multimedia information, e.g., image information,
makes a search using data (keywords) derived from subjective
evaluation results of one or a plurality of persons for test
images, physical image features extracted from images, and
the like.

When an image is searched for using a keyword, a
required image is obtained by matching a given keyword
with that corresponding to the image. Also, a scheme for
obtaining an image, that cannot be obtained by full-word
matching with an input keyword, by matching not only the
input keyword but also an associated keyword associated
with the input keyword with a keyword corresponding to an
image, is proposed. Furthermore, a search scheme which
obtains an image with similar color information by detecting
a correspondence between the input keyword and color
information using, e.g., color information of images is
proposed.

In the image search using keywords, an impression that a
person receives upon watching an image, or key information
linked with the impression is appended to image information
and is used in search. As the key information, words that
express impressions evoked by images such as "warm",
"cold", and the like, and words that represent objects in
drawn images such as "kitty", "sea", "mountain", and the
like are appended as keywords. Also, local image feature
components on drawn images are subjectively evaluated and
are often appended as key information. For example,
information that pertains to a color such as "red", "blue", and the
like, information that pertains to a shape such as "round",
"triangular", "sharp", and the like, and information that
pertains to a texture such as "sandy", "smooth", and the like
are expressed using words and icons, are appended to
images as key information, and are used in search.

In a system in which physical image feature amounts are
extracted from images, and are used in image search, image
features include local colors painted on images, overall color
tones, and shapes, compositions, textures, and the like of
objects on drawn images. An image feature amount is
extracted from segmented regions or blocks obtained by
segmenting the overall image into regions based on color
information, or segmenting the image into blocks each
having a given area, or is extracted from the entire image.
Physical image features include, e.g., color information,
density distribution, texture, edge, region, area, position,
frequency distribution, and the like of an image.

However, in the above search scheme, when an image
including a keyword that matches the input query word is
searched for, images which do not match the search request
of the searcher are often obtained. Especially, when an
image search is made using an abstract query word such as
"refreshing", images found by the search are
limited. To solve this problem, a search may be made by
unfolding the query word "refreshing" to keywords which
are associated with that query word. However, when such
scheme is used, images which are not "refreshing" may be
mixed in search results.

In this manner, the operator cannot designate query
conditions for obtaining a desired search result with respect to
a search request indicated by the input keyword, and cannot
obtain an intended search result. For example, even when
the operator wants to find only images having "refreshing"
feature patterns with respect to a search request "refreshing",
images having content words associated from the search
request "refreshing" such as a music score of a "refreshing"
music, a "refreshing" athlete, and the like, are presented,
i.e., images which do not match the search request are
presented.

In place of a query word, a query image may be input, and
a search may be made using its feature amount. However, in
this case, a query image which reflects the searcher's will
must be prepared, and it is difficult to select a query image,
resulting in poor operability.

SUMMARY OF THE INVENTION

The present invention has been made in consideration of
the above-mentioned problems, and has as its object to
provide an image search method and apparatus which can
[illegible] the information wanted with high precision with
respect to an input query word.

In order to achieve the above object, according to one
aspect of the present invention, there is provided an
information search apparatus for searching information based on
an input query word, comprising first search means for
determining a query keyword on the basis of the query word,
and searching information on the basis of the query
keyword, second search means for determining a feature
amount corresponding to the query word, and searching
information on the basis of the feature amount, setting
means for setting a search weight to be assigned to search
results of the first and second search means, and integration
means for integrating search results obtained by the first and
second search means in accordance with the search weight
set by the setting means.

In order to achieve the above object, according to another
aspect of the present invention, there is provided an
information search method for searching information based on an
input query word, comprising the first search step of
determining a query keyword on the basis of the query word, and
searching information on the basis of the query keyword, the
second search step of determining a feature amount
corresponding to the query word, and searching information on
the basis of the feature amount, the setting step of setting a
search weight to be assigned to search results in the first and
second search steps, and the integration step of integrating
search results obtained in the first and second search steps in
accordance with the search weight set in the setting step.

Other features and advantages of the present invention
will be apparent from the following description taken in
conjunction with the accompanying drawings, in which like
reference characters designate the same or similar parts
throughout the figures thereof.

BRIEF DESCRIPTION OF THE DRAWINGS

The accompanying drawings, which are incorporated in
and constitute a part of the specification, illustrate
embodiments of the invention and, together with the description,
serve to explain the principles of the invention.
FIG. 1 is a block diagram showing an example of the
arrangement of an information processing apparatus which
constructs an image search apparatus according to an
embodiment of the present invention;

FIG. 2 is a block diagram depicting the processing
arrangement in the image search apparatus according to the
embodiment of the present invention;

FIG. 3 is a view showing a display example of search
perspectives in association with a search request word input
in a search request input process 201;

FIG. 4 is a view showing a display example of a weight
designation control panel for designating a search weight for
a search using associative words, and a search weight for a
search using sensory patterns, in the search request word
input in a search request input process 201;

FIG. 5 is a table showing the data structure of an image
holding unit 218 which stores image IDs in correspondence
with image file storage paths;

FIG. 6 is a table showing an example of the data structure
of an image content word holding unit 219 which stores
image IDs in correspondence with image content words;

FIG. 7 is a table which stores data of the image content
word holding unit shown in FIG. 6 as a list of image IDs
using image content words as keys;

FIG. 8 is a table showing an example of the data structure
of a concept discrimination dictionary 205;

FIG. 9 is a table showing an example of the data structure
of an associative word dictionary 211;

FIG. 10 is a table for explaining the data holding format
in a search result holding unit 216;

FIG. 11 is a table for explaining an example of the data
structure of an unfolded sensory pattern holding unit 213
shown in FIG. 2;

FIG. 12 is a table showing an example of the data
structure of an image word/sensory pattern correspondence
holding unit 215 shown in FIG. 2;

FIG. 13 is a table showing the data structure of a sensory
pattern holding unit 220 shown in FIG. 2;

FIG. 14 is a table showing a data example obtained upon
extracting image feature amounts from a single image by an
image feature amount extraction process;

FIG. 15 is a table showing an example of image feature
amounts in this embodiment, which are obtained by
extracting representative colors in units of image regions/blocks;

FIG. 16 is a table showing a storage example of an image
feature amount holding unit 222 shown in FIG. 2;

FIG. 17 is a table showing a data storage example of an
image feature amount/sensory pattern correspondence
holding unit 223 shown in FIG. 2;

FIG. 18 is a flow chart for explaining the operation of the
present invention;

FIG. 19 is a flow chart showing the details of the search
request input process 201 (step S1 in FIG. 18);

FIG. 20 is a flow chart showing the details of an
associative word unfolding process 208 and an image content
word search process 210 using associative words (step S4 in
FIG. 18);

FIG. 21 is a flow chart showing the details of a
[illegible] integration process 217; and

FIG. 22 is a flow chart showing an example of a
pre-process of a search, which is done upon registering an
image.

DESCRIPTION OF THE PREFERRED EMBODIMENTS

The preferred embodiments of the present invention will
now be described in detail in accordance with the
accompanying drawings.

FIG. 1 shows an example of the arrangement of an
information processing apparatus which constitutes an
image search apparatus according to this embodiment.

Referring to FIG. 1, reference numeral 11 denotes a
microprocessor (to be referred to as a CPU hereinafter),
which makes computations, logical decisions, and the like
for image information search in accordance with control
programs, and controls individual building components
connected to an address bus AB, control bus CB, and data bus
DB via these buses. The address bus AB transfers an address
signal indicating the building component to be controlled by
the CPU 11. The control bus CB transfers a control signal for
each building component to be controlled by the CPU 11.
The data bus DB transfers data among the respective
building components.

Reference numeral 12 denotes a read-only memory (to be
referred to as a ROM hereinafter), which stores a boot
processing program and the like executed by the CPU 11
upon starting up the apparatus of this embodiment.
Reference numeral 13 denotes a random access
memory (to be referred to as a RAM hereinafter) which is
configured by 16 bits per word, and is used as a temporary
storage of various data from the respective building
components. In this embodiment, the RAM 13
provides various data holding units such as a query word
holding unit 202, search perspective holding unit 203, search
weight holding unit 204, determined weight holding unit
207, unfolded associative word holding unit 209, unfolded
sensory pattern holding unit 213, and search result holding
unit 216, which will be described later with reference to FIG.
2.

Reference numeral 14 denotes an external memory
(DISK), which stores a concept discrimination dictionary
205 and associative word dictionary 211, and provides data
holding units such as an image word/sensory pattern
correspondence holding unit 215, image content word holding
unit 219, image holding unit 218, sensory pattern holding
unit 220, image feature amount holding unit 222, and image
feature amount/sensory pattern correspondence holding unit
223, which will be described later with reference to FIG. 2.
As a storage medium of the external memory 14, a ROM,
floppy disk, CD-ROM, memory card, magnetooptical disk,
or the like can be used.

Also, the external memory 14 stores programs for
respectively implementing the respective processes, i.e., a search
request input process 201, weight determination process
206, associative word unfolding process 208, image content
word search unit 210 using associative words, sensory
pattern unfolding process 212, sensory pattern search
process 214, search result integration process 217, image
feature [illegible]
loaded onto the RAM 13 as needed, and are executed by the
CPU 11.

Reference numeral 15 denotes a keyboard (KB) which has
[illegible] symbol input keys for inputting a period, comma, and the like,
a search key for instructing a search (a function key on a
general keyboard may be used instead), and various function
keys such as cursor moving keys for instructing cursor
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movement, and the like. Also, a pointing device such as a reference to an associative word dictionary 211, obtaining an 

mouse or the like (not shown) may be connected. antithetic concept antonymous to that query word from the 

Reference numeral 16 denotes a display video memory (to concept discrimination dictionary 205, and unfolding the 

be referred to as VRAM hereinafter) for storing a pattern of obtained antithetic concept into associative words with 

data to be displayed. Reference numeral 17 denotes a CRT 5 reference to the associative word dictionary 211. Reference 

controller (to be referred to is a CRTC hereinafter) for numeral 209 denotes an unfolded associative word holding 

displaying the contents stored in the VRAM 16 on a CRT 18. unit f° r holding the associative words (including those of the 

Reference numeral 18 denotes a display device (CRT) using, antithetic concept) unfolded by the associative word unfold- 

e.g., a cathode ray tube, or the like. The dot display pattern m S process 208. Reference numeral 210 denotes an image 

and cursor display on the CRT 18 are controlled by the 10 content word search process using associative words (to be 

CRTC 17. Note that various other displays such as a liquid referred to as an image content word search process 

crystal display, and the like may be used as the display hereinafter), which finds image content words, that are 

device. Reference numeral 19 denotes a network controller stored in an image content word holding unit 219 and match 

(NIC), which connects the image search apparatus of this me unfolded associative words, by search with reference to 

embodiment to a network such as Ethernet or the like. 15 the unfolded associative word holding unit 209. Reference 

The image search apparatus of this embodiment consti- numeral 211 denotes an associative word dictionary for 

tuted by the aforementioned building components operates storing associative words in units of concepts serving as 

in accordance with various inputs from the keyboard 15 an index words in correspondence with associative perspec- 

various inputs supplied from the network controller via the uvcs ( mis P roccss will bc described in more detail later with 
network. Upon receiving the input from the keyboard 15 or 20 referenc e to FIG. 9). 

network controller 19, and interrupt signal is set to the CPU Reference numeral 212 denotes a sensory pattern unfold- 

11. Upon receiving the interrupt signal, the CPU 11 reads out mg process for unfolding the query word stored in the query 

various control data stored in the external memory 14, and word holding unit 202 into sensory patterns with reference 

executes various kinds of control in accordance with these to 80 image word/sensory pattern correspondence holding 
control data. Also, the present invention is achieved by 25 unit 215, obtaining an antithetic concept antonymous to that 

supplying a storage medium that stores a program according query word from the concept discrimination dictionary 205, 

to the present invention to a system or apparatus, and by and unfolding the obtained antithetic concept into sensory 

reading out and executing program codes stored in the patterns with reference to the image word/sensory pattern 

storage medium by a computer of the system or apparatus. correspondence holding unit 215. 

FIG. 2 is a block diagram depicting the processing Reference numeral 213 denotes an unfolded sensory

arrangement in the image search apparatus of this embodi- pattern holding unit for temporarily storing the sensory

ment. patterns unfolded by the sensory pattern unfolding process

Referring to FIG. 2, reference numeral 201 denotes a 212. Storage of data in the unfolded sensory pattern holding
search request input process for inputting query items (query unit 213 will be described later with reference to FIG. 11.

word, search perspective or category, and search weight in Reference numeral 214 denotes a sensory pattern search 

this embodiment; to be described in detail later) that pertain process for finding sensory patterns, which are stored in the 

to the information wanted. Reference numeral 202 denotes sensory pattern holding unit 220 and are similar to the 

a query word holding unit for storing a query word input by unfolded sensory patterns, by search with reference to the 
the search request input process 201. Reference numeral 203 40 sensory pattern holding unit 220. 

denotes a search perspective holding unit for storing a search Reference numeral 215 denotes an image word/sensory 

perspective input by the search request input process 201. pattern correspondence holding unit for storing the corre-

Reference numeral 204 denotes a search weight holding unit spondence between the image words and sensory patterns by

for storing a search weight input by the search request input storing sensory pattern IDs corresponding to sets of image
process 201. words and associative words associated with the image

Reference numeral 205 denotes a concept discrimination words. Note that the image word/sensory pattern correspon- 

dictionary having a search perspective that pertains to a dence holding unit 215 will be described in detail later with 

concept as information wanted, an antithetic concept having reference to FIG. 12. 

a meaning contrary or antonymous to the concept as the Reference numeral 216 denotes a search result holding 
information wanted, and two kinds of coefficients unit for storing image IDs found by searches of the image

(associated weight and sensory pattern weight) for weight content word search process 210 and sensory pattern search

discrimination upon searching for a concept, as shown in process 214. Reference numeral 217 denotes a search result

FIG. 8. Note that the concept discrimination dictionary 205 integration process for integrating the search results of

will be described in detail later with reference to FIG. 8. image content words using the associative words, and the
Reference numeral 206 denotes a weight determination search results of sensory patterns stored in the search result

process for giving weights (associated weight and sensory holding unit 216, on the basis of the search weights obtained

pattern weight) indicating the weight balance on associative by the weight determination process 206 and stored in the

words (obtained by an associative word unfolding process- determined weight holding unit 207 

a query word sfored in the* query Jord hSg unitTo? *° "ZfJS^T^™ * ^ ™ CiaCe 

Refcr.ncen^riWdenoto.&te.n^SStS vZltn! T, tes ™™& ^ ™? ^ding unit for 

unit for holding the search weigh, determined by weTghl ^£ Z17 holl? T' 8 * *"* 

determination process 206 o f ™ a8e ho TlL 218 to ex P ress ^ir contents. 

Reference numeral 220 denotes a sensory pattern holding
Reference numeral 208 denotes an associative word unit for storing matching levels between image information

query word holding unit 202 into associative words with and sensory patterns. More specifically, the sensory pattern
holding unit 220 stores the matching levels with sensory using associative words and sensory patterns in the search

patterns in units of image IDs. The data contents of the request input process 201. As described above, in this 

sensory pattern holding unit 220 will be described later with embodiment, a search using associative words and a search 

reference to FIG. 13. using the feature amounts of images (sensory patterns) based 

Reference numeral 221 denotes an image feature extrac- on the query word are made, and the search results are

tion process for extracting physical image feature amounts integrated. In this integration process, the two search results

from image information stored in the image holding unit are weighted. On the weight designation control panel, the

218. Physical image feature amounts are visual features or user can designate a search weight for a search using

signatures extracted from regions segmented on the basis of associative words and that for a search using sensory

color information, blocks each segmented to have a eiven 10 P attems - ^ »?> us" c m designate the weight balance 

area, or the entire image. For example, the image feature is Z^ aT^s^^ " ^ ^ 

numerical information such as the color distribution or P ' ?T actu ^ e ^ rcn ; 

histogram, density distribution, texture, edge, frequency , i^lf ^ ^ when the user slides a slide button 41 

distribution, and the like of an image. Note that the image ? " M mstructlon lhat sets a hea ™ r on a 

feature amounts will be described in detail later with refer- 15 .^.T 01 ^ WOrdS " jssned; When he 0r Sne 

ence to FIG 14 sudes shdc but, °n 41 to the right, an instruction that sets 

Reference numeral222 denotes an image feature amount ^ ierwi g ht ° na ^busing sensory patterns isissued. 

•. c . ■ -l • r- 6 n-oiuisi omuiuii When the user designates search weiehts usinp the <;liHe 

holding unit for storing the unage feature amounts obtained button 41 ^ then * sses „ qk buTon 43 SftSht 

sensory pattern despondence holding unit for storing £££ ^^^S^^^ 

pattern corresponding holding unit 223 sto^XTaN ^ !? ^ ^P^ ™ at ion dictionary 205 (FIG. 

tern IDs and Lge LhrrTfrnount data clffi * 5f^*^ 1 

those IDs. Note that the data structure of the image feaLe ZZ, g " nel "t v^,, t ?t ? °V he 

amounl/sensory pattern correspondence holding unit 223 s Zwn P * by * P ° mUng dcwcc (n0t 

will be described in detail later with reference to FIG. 17 tu . ^ „ , . 

Reference numeral 224 denotes a sensory pattern deter- 3o de^LTbelow ^FIG T ^ *" ^ ^ 

mination process for comparing image feature amounts 30 oir e u ,u A ' , ■ . 

extracted from a sensory pattern and image information and „?7 °? of the image holding unit 

determining their matching level with reference to the image «, , ^ "correspondence with image 

feature amount holding unit 222 and image feature amount/ f'T^ P "m^ !? n& $ ' Khl6nC * DUmeral 51 

sensory pattern correspondence holding unit 223 and re E - f Bn ™ ag f ID Whlch 1S aQ ldentiflca 'ion number 

istering the result in the aforementioned sensory pattern 35 TZt * ^,1? ™ a g e j- e « this image database, 

holding unit 220 Reference numeral 52 denotes a file path which indicates the 

Adisplay example of a search perspective that pertains to mT ^ fil f, COrre f ondin e <° ,he ™& 
search request iterns input a, JseaL request ^ut pro- S^JSJ ^ » ^ 

c-ng urn, 201 will be explain* below with reference to 40 ^ ^^1^^ ^ ^ data fields 

the search perspective. When thTuse P resL ^chSanOK 5 ° ofMl « os °f' Corp. as such image format, but other 

k„..~ • ,u- . . =-™" cul ™ 5USCr presses (^cucacsj an UK compression formats such as GIF, JPEG FlashPix and the 
button in this state, the search perspective "color tone" is like may be used riasnrix, and the 

selected, and is held in the search perspective holding unit -n,. I, ^ t.u • 

203. the strucmre of the image content word holding unit 219 

d„ „ • r . , be described below with the aid of FIG 6 

By pressing one of the cursor moving keys on the key- „ prr. a eh*™ , .7 ' 

board 15, the hatching moves from "color tone" to "taste" or • f ^ ? X ?!? ple ° f °* data structure of ^ 

"general atmosphere", and the user can d«^a£ a^ired "T' ^ ^ 219 ^ St0reS ^ 

search perspective or category. Note mat "mikT as the quTry ^nr ^^P 00 ^ ^/^f 00016,11 words - ReferriD 8 

word is held in the query word holding unit 202 and the ° 6 > rcferen « 61 denotes an image ID, which 

selected search perspective ("color tone" in FIG 3) U h^ d 60 Ze^OH ^ ™ " " RefeKnCe 

in the search perspective holding unit 203 40 "Tf V 1 62 deDOteS 311 ma e 6 °° ntent word which stores a 

A display example on the control panel when the operator word for expressing each image identified by the image ID

instructs the search weight balance ^ ^ K a keyWOrd which verball y 

j j wcigui uaiance on a searcn usmg asso- expresses an image feature in an image and is stored a<: a 

ciative words and a search usmg sensory patterns in actual r h»r<ri« t ■ a \ T , ma f c ' m ° K SWKa 15 a 

search will be explained below with reflrenceTnG 4 fi5 ( g " ^ Aplurallt y of ma y 

Fir, d =h^wc , i i ' cnt T 10 rlu ' * « be stored per image, and the image content word holding 

J2 Iti f ^lay sample of a wight designation unit 219 is constructed as a list of image content wordH! 

control panel for instructing search weights for searches using image IDs 61 as keys. 
FIG. 7 shows a table which stores the data of the image matched associative words is zero, the association matching

content word holding unit shown in FIG. 6 as a list of image level 103 also stores zero. 

IDs using image content words as keys. Referring to FIG. 7, Reference numeral 104 denotes a field for storing the 

all image IDs that contain image content words 71 as number of sensory patterns with the highest similarity which

keywords are stored as image IDs 72. are found by search by the sensory pattern
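By way of illustration only, the relationship between the table of FIG. 6 (image content words listed per image ID) and the table of FIG. 7 (image IDs listed per image content word) may be sketched as an inversion of one mapping into the other. The function name and the sample data below are hypothetical and do not appear in the specification.

```python
from collections import defaultdict


def invert_content_words(words_by_image):
    """Turn an image-ID-keyed table (FIG. 6 style) into a word-keyed table (FIG. 7 style)."""
    images_by_word = defaultdict(list)
    for image_id in sorted(words_by_image):
        for word in words_by_image[image_id]:
            # Avoid listing the same image twice under one word.
            if image_id not in images_by_word[word]:
                images_by_word[word].append(image_id)
    return dict(images_by_word)


# Hypothetical sample data; the specification does not list these values.
SAMPLE_WORDS_BY_IMAGE = {
    1: ["forest", "lake"],
    2: ["forest"],
    3: ["sunset"],
}
SAMPLE_IMAGES_BY_WORD = invert_content_words(SAMPLE_WORDS_BY_IMAGE)
```

Keeping both tables allows a lookup in either direction without scanning: by image ID when displaying the keywords of an image, and by keyword when searching.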

The structure of the concept discrimination dictionary 205 223. Reference numeral 105 denotes a sensory pattern ID list 

will be described below using FIG. 8. which stores a maximum of 20 sensory pattern IDs of

FIG. 8 shows an example of the data structure of the matched sensory patterns. When the number 104 of matched 
concept discrimination dictionary 205. The concept dis- sensory patterns is zero, the sensory pattern ID 105 is filled
crimination dictionary 205 provides information that per- with NULL code. Reference numeral 106 denotes a field for
tains to a query word serving as a search request, and is a storing the search matching level of a sensory pattern search
table which stores index words 80 corresponding to query with respect to the image ID 100. When the number 104 of
words, search perspectives 81 associated with index words, matched sensory patterns is zero, the sensory pattern match-
antithetic concepts 82 having meanings contrary to the index ing level 106 stores zero. Reference numeral 107 denotes a
words in units of search perspectives, associative word field for storing an integrated matching level (obtained by
weights 83 used upon searching the index words, and the search result integration process 217) of the image ID 
sensory pattern weights 84 used upon searching the index 100 with respect to the search request, which is calculated 
words in correspondence with each other. using the associative matching level 103 and sensory pattern 

The structure of the associative word dictionary 211 will matching level 106 as parameters, 
be explained below with reference to FIG. 9. 20 The structure of the above-mentioned unfolded sensory 

FIG. 9 shows an example of the data structure of the pattern holding unit 213 will be described in detail below 

associative word dictionary 211. The associative word die- with reference to FIG. 11. 

tionary 211 is composed of associative IDs 90 each of which FIG. 11 is a table for explaining an example of the data 

assigns a unique number to a set of associative words for structure of the unfolded sensory pattern holding unit 213 

each index word, index words 91 each serving as a start shown in FIG. 2. Referring to FIG. 11, reference numeral

point of association, associative words 92 evoked by the 110 denotes an image word as an unfolding source from

index words 91, associative perspectives 93 which are which this sensory pattern has evolved upon unfolding and

relevant to associations of the associative words 92, and the same image word as that in the query word holding unit

association strengths 94 each indicating the strength of 202 is stored. In this embodiment, a character string

association between each pair of index word 91 and asso- "refreshing" is stored, and ends with NULL code. Reference 

ciative word 92. numeral 111 denotes the number of sensory patterns 

In this embodiment, the association strength 94 assumes obtained by unfolding the image word 110 with reference to 

an absolute value ranging from 0 to 10, and its sign indicates the image word/sensory pattern correspondence holding unit 
direction of association of the associative word. More 215. For example, when the contents of the image word/

specifically, when the association strength is a positive sensory pattern correspondence holding unit 215 are as 

value, it indicates a stronger associative relationship (higher shown in FIG. 12, the number of sensory patterns unfolded 

bilateral association) as the association strength value is from the image word "refreshing" is 7. Reference numeral 

larger; when the association strength is a negative value, it 112 denotes an address indicating the storage location area 
indicates a harder associative relationship as the association of data obtained by actually unfolding the image word

strength value is larger. For example, an associative word "refreshing". In the example shown in FIG. 11 the storage 

"folkcraft article" corresponding to an index word "simple" location address 112 is linked with unfolded data 115 shown 

in associative data with the associative ID=126533 can be in FIG. 11.

associated with strength "6" but an associative word "chan- In the data 115, data actually unfolded from "refreshing",
delier" in associative data with the associative ID=126536 is i.e., sets of associative words and sensory patterns corre-

hardly associated with strength "9" since its association sponding to the number 111 of sensory patterns are stored.

strength is a negative value. In this embodiment, seven sets of associative words and
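For illustration, one entry of the associative word dictionary 211 may be modelled as a record whose association strength is a signed integer, the magnitude ranging from 0 to 10 and the sign giving the direction of association. The class and property names below are hypothetical. The two sample entries follow the example in the text, where the word "chandelier" and the value -9 are readings of a partly illegible passage; the associative perspective 93 is omitted because no value for it is given here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AssociativeEntry:
    associative_id: int
    index_word: str
    associative_word: str
    strength: int  # signed; the magnitude ranges from 0 to 10

    def __post_init__(self):
        if not -10 <= self.strength <= 10:
            raise ValueError("association strength must lie in -10..10")

    @property
    def is_easily_associated(self):
        # A positive sign means the associative word is readily evoked.
        return self.strength > 0

    @property
    def magnitude(self):
        return abs(self.strength)


FOLKCRAFT = AssociativeEntry(126533, "simple", "folkcraft article", 6)
CHANDELIER = AssociativeEntry(126536, "simple", "chandelier", -9)
```

Storing direction and magnitude in one signed field keeps the table compact, at the cost of requiring every consumer to test the sign before using the value as a weight.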

The structure of the search result holding unit 216 will be sensory patterns are stored. For example, an associative

described below with reference to FIG. 10. word 114 is that of the image word "refreshing" and stores 

FIG. 10 shows the data holding format in the search result 50 a character string "forest" in this example. Also, a sensory 

holding unit 216. As described above, the search result pattern ID 113 corresponds to the image word "refreshing" 

holding unit 216 stores image IDs which are found by and its associative word "forest", and stores "5" in this 

searches of the image content word search process 210 using example. The same applies to other sets of associative words 

associative words and the sensory pattern search process and sensory patterns. 

214. The structure of the aforementioned image word/sensory

Referring to FIG. 10, reference numeral 100 denotes a pattern correspondence holding unit 215 will be described in 

field for storing image IDs found by search; 101, a field for detail below using FIG. 12. 

storing the number of matched associative words with FIG. 12 shows an example of the data structure of the 

positive association strengths by the image content word image word/sensory pattern correspondence holding unit

search process 217. An associative word ID list 102 stores 215 in FIG. 2. Referring to FIG. 12, reference numeral 120

™v T I' k W % as f oaatlvc word *=°sory pattern. In FIG. 12, character string? "refreshing- 
nary 211. When the number 101 of matched associative "tropical", and the like are stored, and end with NULL code.
words is zero, the associative ID 102 is filled with NULL Reference numeral 121 denotes an associative word

code. Reference numeral 103 denotes a field for storing unfolded from each image word 120. In this embodiment,

the associative matching levels of associative words with associative words "forest", "tableland", "blue sky", and the

respect to the image IDs 100. When the number 101 of like are stored in correspondence with "refreshing" and
these character strings end with NULL code. When no this embodiment, a representative color is used) indicating
character string is stored in this field 121, i.e., NULL code an image feature extracted from each region or block (B1,

alone is stored, this sensory pattern applies to all image B2, . . . , Bm). This embodiment exemplifies a case wherein

words "refreshing"; no specific associative word has been chromatic image features are extracted, and for example, a

designated. plurality of pieces of information C11(R11, G11, B11), . . . ,

Reference numeral 122 denotes a sensory pattern ID Cn1(Rn1, Gn1, Bn1) indicating colors are stored. Refer-

corresponding to the image word 120 and associative word ence numeral 164 denotes image feature amounts of image

121. In this embodiment, "005" and "006" are stored as features extracted from the individual regions/blocks. In this

sensory pattern IDs corresponding to the image word embodiment, c11, . . . , cn1 are stored as the feature amounts

"refreshing" and its associative word "forest". Also, sensory of C11(R11, G11, B11), . . . , Cn1(Rn1, Gn1, Bn1).

patterns for "not refreshing" as an antithetic concept of The structure of the image feature amount/sensory pattern 

"refreshing" are stored. In this embodiment, for "not correspondence holding unit 223 will be described in detail 

refreshing", no associative words are registered and "001" below using FIG. 17.

and "010" are registered as sensory pattern IDs. FIG. 17 shows a data example of the image
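As a minimal sketch of the correspondence described for FIG. 12, the image word/sensory pattern correspondence holding unit 215 may be viewed as rows of an image word, an optional associative word (NULL meaning that no specific associative word is designated), and sensory pattern IDs. Only the two rows spelled out in the text are included; the remaining rows of FIG. 12 are not reproduced, and the names `CORRESPONDENCE` and `unfold_image_word` are hypothetical.

```python
# Rows of (image word, associative word or None, sensory pattern IDs).
# None stands in for the NULL code: the patterns then apply to the image
# word as a whole rather than to one associative word.
CORRESPONDENCE = [
    ("refreshing", "forest", ["005", "006"]),
    ("not refreshing", None, ["001", "010"]),
]


def unfold_image_word(image_word, table=CORRESPONDENCE):
    """Return (associative word, sensory pattern ID) pairs for an image word."""
    pairs = []
    for word, associative_word, pattern_ids in table:
        if word == image_word:
            pairs.extend((associative_word, pid) for pid in pattern_ids)
    return pairs
```

Under these assumptions, unfolding "refreshing" yields the pairs for "forest", and unfolding the antithetic concept "not refreshing" yields two pairs with no associative word.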

The structure of the above-mentioned sensory pattern feature amount/sensory pattern correspondence holding unit

holding unit 220 will be described in detail below using FIG. 223 in FIG. 2. Referring to FIG. 17, reference numeral 171

13. denotes a sensory pattern ID, which uniquely identifies a

FIG. 13 shows the data structure of the sensory pattern sensory pattern. Reference numeral 172 denotes image fea-

holding unit 220 in FIG. 2. Referring to FIG. 13, reference ture amount data corresponding to each sensory pattern ID.
numeral 131 denotes an image ID for identifying an image In this embodiment, a sensory pattern is expressed by a

to be registered. The image IDs 131 use the same ones as chromatic image feature amount, and a combination of color

those stored in the image holding unit 218, and uniquely components (values in a color space such as RGB, HVC, or

define images in this system. A field 132 stores sensory the like) corresponding to each sensory pattern ID is stored.

pattern IDs. In this embodiment, since the matching levels In this embodiment, RGB values assume integers ranging
between each image and all sensory patterns stored in the from 0 to 255. A maximum of m colors correspond to each

image feature amount/sensory pattern correspondence hold- sensory pattern ID. 

ing unit 223 are calculated, all the sensory pattern IDs (1 to The sensory pattern determination process 224 calculates 

m) are stored. Reference numeral 133 denotes a numerical the matching levels between each of image data registered in 

value indicating the matching level between each image and the image holding unit 218 and the respective sensory

sensory pattern. The matching level assumes a value ranging patterns using the aforementioned image feature amount

from 0 to 1; 0 indicates the image does not match the sensory holding unit 222 and image feature amount/sensory pattern

pattern at all, and the matching level becomes higher as it is correspondence holding unit 223, and registers them in the

closer to 1. For example, in this embodiment, the matching sensory pattern holding unit 220 (to be described later in step

level between image with the image ID=001 and sensory S87 in FIG. 22). 

pattern 1 is 0.10, and the matching level between that ima<re 35 n,. „,v,™..o . j • • 

and sensory pattern 2 is 0 ntnatima e e The processes executed m the image search apparatus of 

this embodiment with the above arrangement will be

The aforementioned image feature amounts will be described below.

explained in detail below with reference to FIG. 14. ftp. 1 s , fl™„ ~u „ u • .u ? , ■ 

FIG. 18 is a flow chart showing the operation of the image
FIG. 14 shows a data example obtained upon extracting search apparatus of this embodiment. Referring to FIG. 18,

the image feature amounts from one image by the image step S1 is a processing module that implements the search

feature amount extraction process. In FIG. 14, X1, X2, X3, request input process 201 in FIG. 2, and inputs a search

. . . , Xn represent image features, B1, B2, . . . , Bm represent request. Note that the details of this process will be explained

regions/blocks from which image feature amounts are in detail later with reference to FIG. 19.

extracted, and x11 to xmn represent image feature amounts

extracted from the individual regions/blocks. That is, Step S2 is a processing module that implements the

amounts that pertain to physical image features X1 to Xn weight determination process 206, and if it is determined

obtained in units of regions/blocks. with reference to the contents stored in the search weight

holding unit 204 in the search request input process 201 in

FIG. 15 exemplifies a case wherein chromatic image step S1 that search weights are designated, the designated
feature amounts are especially extracted In this case, rep- 50 values are stored in the determined 4 hCS» 

%T£? C p°p ™ e f :° f fe f 0DS 01 WOCkS 0n 'heotherhand, if nosearch weightsL des^^d mdex 

222Zi^Tjl^\T e<: ^ h ° lding UDit miDed Weight holdin e unit 207 If *>™ » n ° ind ^ word 80 

\ < t 115,116 n ° " fc relevant to me contents ° f * e l^ry word holding unit 

FIG. 16 shows a storage example of the image feature 202, a default value "5" is stored as both the associated and 
amount holding unit 222 in FIG. 2. Referring to FIG. 16, sensory pattern weights in the determined weight holding

reference numeral 161 denotes an image ID for identifying unit 207. 
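The weight determination of step S2 may be sketched, purely as an illustration, as a three-level fallback: weights designated by the user take precedence; otherwise the weights registered for the matching index word in the concept discrimination dictionary 205 are used; otherwise the default value "5" is stored for both weights. The middle case is inferred from a partly illegible passage, and the function name, the dictionary layout, and the sample values are hypothetical.

```python
DEFAULT_WEIGHT = 5  # default stored when no index word matches the query word


def determine_weights(query_word, designated, dictionary):
    """Return (associated weight, sensory pattern weight) for one search.

    designated: a (w_assoc, w_pattern) tuple from the search weight holding
                unit, or None when the user designated nothing.
    dictionary: maps an index word to its (w_assoc, w_pattern) pair; this
                stands in for the concept discrimination dictionary.
    """
    if designated is not None:
        return designated
    registered = dictionary.get(query_word)
    if registered is not None:
        return registered
    return (DEFAULT_WEIGHT, DEFAULT_WEIGHT)


# Hypothetical dictionary contents, for illustration only.
SAMPLE_DICTIONARY = {"mild": (8, 2)}
```

The returned pair corresponds to what the text stores in the determined weight holding unit 207.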

an image to be registered. The image IDs 161 use the same It is checked with reference to the determined weight

ones as those stored in the image holding unit 218. Refer- holding unit 207 in step S3 if the associated weight is zero.

ence numeral 162 denotes a block or region number from If the associated weight is zero, the flow advances to step S5;
which an image feature amount is extracted. In this otherwise, the process in step S4 is executed. Step S4 is a

embodiment, B1, B2, . . . , Bm represent the region/block processing module that implements the associative word

numbers. Reference numeral 163 denotes information (in unfolding process 208 and image content word search 
process 210 using associative words in FIG. 2, and this FIG. 20 is a flow chart showing the details of the

process will be described in detail later with reference to associative word unfolding process 208 and the image 

FIG. 20. content word search process 210 using associative words

It is checked with reference to the determined weight (step S4 in FIG. 18).

holding unit 207 in step S5 if the sensory pattern weight is Referring to FIG. 20, the associative word dictionary 211

zero. If the sensory pattern weight is zero, since a search is searched using the query word stored in the query word

using sensory patterns is unnecessary, the flow advances to

step S7; otherwise, the process in step S6 is executed. Step holding unit 202 to obtain associative word data in step S41.

S6 is a processing module that implements the sensory More specifically, the associative word dictionary 211 is

pattern unfolding process 212 and sensory pattern search searched for index words 91 (FIG. 9), which match the query

process 214 in FIG. 2, and will be described in detail later word, and registered associative word data are extracted. If

with reference to FIG. 21. index words 91 that match the query word are found, all their

Step S7 is a processing module that implements the search associative IDs are stored in the unfolded associative word 

result integration process 217 in FIG. 2, and will be holding unit 209. 

described in detail later with reference to FIG. 22. In step S42, the concept discrimination dictionary 205 is 
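The branching of steps S3 to S8 in FIG. 18 may be sketched as follows, assuming only what the text states: the associative word search (step S4) is skipped when the associated weight is zero, the sensory pattern search (step S6) is skipped when the sensory pattern weight is zero, and integration (step S7) and display (step S8) always follow. The function name is hypothetical.

```python
def plan_search_steps(associated_weight, sensory_pattern_weight):
    """List the processing steps executed after weight determination (step S2)."""
    steps = []
    if associated_weight != 0:       # check made in step S3
        steps.append("S4")           # associative word unfolding and search
    if sensory_pattern_weight != 0:  # check made in step S5
        steps.append("S6")           # sensory pattern unfolding and search
    steps.append("S7")               # search result integration
    steps.append("S8")               # display of the found images
    return steps
```

Skipping a search whose weight is zero avoids work whose result could not influence a weighted integration.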
In step S8, image files corresponding to image IDs stored 15 searched, and if an index word that matches the query word 

in the search result holding unit 216 as search results in the query word holding unit 202 is found, a search 

obtained in step S7 are read out from the image holding unit perspective 81 (FIG. 8) corresponding to that index word is 

218, and are displayed. Note that the search result display extracted. The extracted search perspective is compared 

process in step S8 is a known one which is prevalent in with that stored in the search perspective holding unit 203 in 
image search apparatuses of the same type. step S25, and if they match, an antithetic concept 82

FIG. 19 is a flow chart showing the details of the search corresponding to this index word is extracted. On the other

request input process 201 (step S1 in FIG. 18). In step S21, hand, if the two search perspectives do not match, data in

a query word serving as a search request is input. The query which the query word matches an index word continues to

word input is attained by storing a character code input at the be searched for. Upon completion of checking for all the
keyboard 15 in the query word holding unit 202 on the RAM extracted search perspectives, the flow advances to step S43.

In step S43, the associative word dictionary 211 is 

In step S22, using the query word stored in the query word searched for associative words having an index word, which 

holding unit 202 as a search request, search perspectives that matches the antithetic concept found in step S42. If an index 

are relevant to the search request are extracted from the word that match the antithetic concept is found, their asso- 

concept discrimination dictionary 205. In this case, all ciative IDs are stored in the unfolded associative word 

search perspectives corresponding to index words 80 (FIG. holding unit 209 by appending a status code indicating an 

8), which match the query word in the query word holding antithetic concept thereto.

unit 202, are extracted. For example, when the query word In step S44, associative words are extracted based on the

is "mild", three search perspectives "color tone", "taste", associative IDs stored in the unfolded associative word

and "general atmosphere" can be obtained. holding unit 209, and the image content word holding unit

It is checked in step S23 if a search perspective or 219 is searched for image content words that match the 

perspectives is or are found by search perspective extraction associative words. The search results are stored in the search 

in step S22. If a search perspective or perspectives is or are result holding unit 216.

found, the flow advances to step S24; otherwise, the flow In this process, the associative IDs stored in the unfolded

advances to step S26. associative word holding unit 209 are extracted, and corre-

If search perspectives are found in step S22, they are sponding associative data are extracted with reference to the

displayed together with the query word, as shown in FIG. associative word dictionary 211. Next, the association

3 in step S24. In step S25, the user selects a desired one of strengths 94 of the extracted associative data are extracted 

the displayed search perspectives using the user interface 45 and are set in a work memory ASCF (not shown) on the 

that has been described previously with reference to FIG. 3. RAM 13. In this case, if a status code indicating an antithetic 

The selected search perspective is stored in the search concept is appended to a given associative ID extracted from 

perspective holding unit 203. the unfolded associative word holding unit 209, the sign of 

In step S26, the user inputs search weights which deter- the association strength is inverted to indicate a negative 

mine the weight balance on a search using associative words so association strength. However, if the association strength is 

and a search using sensory pattern in actual search in relation already a negative value, that associative data is discarded, 

to the search process method in response to the search and the next associative data is checked.

request. In this embodiment, the user sets the weights using Then, an associative perspective corresponding to each

the user interface shown in FIG. 4. That is, the user operates associative ID is extracted, and is compared with that stored

the slide bar shown in FIG. 4 to designate the weight ratios in the search perspective holding unit 203. If the two

(desired search weights) on associative words and sensory perspectives match, a predetermined value α is set in a work

patterns by the length of the horizontal bar of the slider (the memory VPF (not shown) assured on the RAM 13. If they

position of the button 41). When the user does not designate do not match, a value α×0.1 is set in the work memory VPF.

vleTo^? 18 ^, 116 ,? 1 Shede i^ a l eS ™ of toe d ^ault Finally, the image content word holding unit 219 is 

tLt v h ^ ST? m T L 60 SMrched f0r ™& w ° rds toat match associative 

It is checked in step S27 if search weights are designated. words corresponding to the associative IDs. If an image

If it is instructed to use the default weight values, the content word is found, an image ID corresponding to that

processing ends. On the other hand, if search weights are image content word is acquired from the image ID 72 (FIG.

designated, the flow advances to step S28 to store the 7), and is set in the found image ID 100 (FIG. 10) in the

designated associative word and sensory pattern weights search result holding unit 216. "1" is set in the number 101

designated in step S26 in the search weight holding unit 204, of matched associative words, and the found associative ID

thus ending the processing. is set in the associative word ID 102. A value obtained






US 6,526,400 B1

by multiplying the value in the work memories ASCF and results. When the sensory pattern search results include a 

VPF on the RAM 13 by a predetermined score 0 based on sensory pattern based on the antithetic concept to the query 

associative word matching is stored as an associative match- word, the corresponding image is excluded from the inte- 

ing level in the associative matching level 103. If an grated results. Or the sensory pattern matching level of an 
identical image ID has already been stored, the value of the image including a sensory pattern of the antithetic concept

number 101 of matched associative words is incremented by may be lowered upon integration.

wordTn iTfUZ T d M J* added , t ° ,he , a f Soci 1 a,iv t In this integration process, a method of obtaining com- 

hh ^ ^ Ii, . ^ated associative matching level mon elements of ^ Kts of resu](s m ^ 

undate ts 1 ^ ma 8 t0 aSS0dativ6 WOrds CANDh* search results), a method of 
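The contribution of one matched associative word to the associative matching level 103 may be sketched as the product of the values held in the work memories ASCF and VPF and a predetermined score. ASCF holds the association strength, with its sign inverted when the associative ID carries the antithetic-concept status code; an entry whose strength is already negative in that case is discarded. VPF holds a predetermined value when the associative perspective matches the selected search perspective and one tenth of that value otherwise. The function name and the default values of `alpha` and `base_score` are hypothetical, since the specification only calls them predetermined.

```python
def association_score(strength, perspective, wanted_perspective,
                      antithetic=False, alpha=1.0, base_score=1.0):
    """Score one matched associative word, or return None if it is discarded."""
    if antithetic:
        if strength < 0:
            return None       # already negative: the associative data is discarded
        strength = -strength  # invert the sign to mark a negative association
    ascf = strength
    vpf = alpha if perspective == wanted_perspective else alpha * 0.1
    return ascf * vpf * base_score
```

With these assumed values, a word with strength 6 scores 6.0 when the perspectives match and one tenth of that when they do not, which shows how strongly the selected search perspective filters the associative words.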

P 1 V Ue a 10 calculating integrated matching levels based on the weights 

FIG. 21 is a flow chart showing the details of the sensory on the searches, and selecting appropriate search results in 

pattern unfolding process 212, sensory pattern search pro- descending order of integrated matching levels, and the like 

cess 214, and search result integration process 217. are available. In this embodiment, the method of calculating 

As described above, the user inputs a search request for the integrated matching levels will be exemplified below 

searching unages by the search request input process 201. 15 Let A be the associative matching level of an image that 

The search request contains one or a plurality of query matc hes, e.g., an associative word "forest" stored in the 

words, search perspectives, and the like. The query word sca rch result holding unit 216, B be the sensory matching 

input tn this embodiment is an abstract image word that level of an image that matches the sensory pattern ID "005" 

"S^nTfh 1 ! rl° nS , °L lmag K S ^ UCh / S " KiKS ^'> M corresponding to the associative word "forest", and wl and 

warm , and the like. In this embodiment, assume that an ™ w2 (wl + w2=l) be the search weights stored in the deter- 

image word refreshing' is stored. mined weight holding ^ 2ffJ ^ ^ 

Meps S61 and S62 are implemented by the sensory ing level is given by: 
pattern unfolding process 212. In step S61, the image word 

held in the query word holding unit 202 is unfolded into integrated matching levcl-wixA+irtxB 

sensory patterns with reference to the image word/sensory 25 
pattern correspondence holding unit 215. In this 

embodiment, the query word holding unit 202 stores the integrated matching levci=(wixA*wlxtf) ,a 
image word "refreshing", the unfolded associative word 

holding unit 209 holds associative words "forest", ^ mte grated matching levels of all sensory patterns of all 

"tableland", "blue sky", and the like unfolded from 30 associative words are calculated. When one image ID has 

"refreshing", and the image word is unfolded into corre- matching levels larger than zero with respect to a plurality 

sponding sensory pattern IDs with reference to the image of S6nsor y pattern IDs, a plurality of integrated matching 

word/sensory pattern correspondence holding unit 215. For levels are obtained for one image. However, in this case, an 

example, sensory pattern IDs "005" and "006" correspond- imag6 with 106 highest integrated matching level is adopted 

ing to image word "refreshing" — associative word "forest" 35 45 a result (step S65). 

are acquired, and a sensory pattern ID "007" corresponding P roc ess is done for all images corresponding to either 

to image word "refreshing"— associative word "tableland" set of ^^ch results larger than zero, and images whose 

is acquired. integrated matching levels are larger than a predetermined 

The flow then advances to step S62 to store the sets of mresnold valu e X are selected as integrated search results 

unfolded sensory pattern IDs and image words/associative 40 (St f pS S66 ' f? 7 '' md S6g > 

words in the unfolded sensory pattern holding unit 213 The In step S69, the sets of image IDs and their integrated 

data storage in the unfolded sensory pattern holding unit 213 ^atoning levels are stored in the search result holding unit 

is as has already been described previously with reference to ' eading ^ search P mcess - 

FIG. 11. ' An image registration process for registering test images 

The flow advances to step S63. Steps S63 and S64 are 45 b6l ° W r6 . ference to FIG - 22 ' 

implemented by the sensory pattern search process 214 In " 3 flow J chart shewing M ex *mple of a search 

step S63, all image IDs of images having matching levels P rc -P I0C ^ ^ecuted upon registering images. This process 

larger than zero with respect to the sensory pattern IDs ^° tr °" e ^ accordance with a processing program stored 

stored in the unfolded sensory pattern holding unit 213 are <n r„ cei ,i j • 

acquired. This process is done for all the sensory patterns rt, P S "' T des 1 l 6 nates ™ ™ a 8 e to be registered, 

held in the unfolded sensory pattern holding unit 213 Note T to 1 be / e g Ist « ed K designated from those stored 

that the sensory pattern search process 214 acquires image " If™ St ° ra§e deVlCe ' an m P ut device . ™ 

IDs having matching levels larger than zero with respect To Zflt ^fhf IT J 00 ""* 1 ** ^ ima S e P rocessin S 

the sensory pattern IDs respectively unfolded from the query « tT'T' ° f ^ ^ ( none ° f *«n are shown). Id this 

word and antithetic concept. 5S embod ™ent, assume that images serving as test images are 

t„ . c ■ , stored in advance, and the image to be registered is selected 

In step S64, sets of acquired sensory pattern IDs, image from them 'eysierea is selected 

^triria^ssi" step 563 ™ *i£z%^f^*^^ 

tv a .u i „ iu corresponding to the designated image file name, and 

im JemeTeH H t™™ ? ^ S65 * S<9 ™ 60 VarioUS of ^ Nation required for registradon 

imp emented by the search result integration process 217. are acquired, and are supplied to the image feature eSon 

^ tesul ! s ;. lc - **J*W Process221.TheimageIDisstored m conespondencewi°h 
word search resulu using associative words and sensory the image file name to manage an image, .ndfc acquired 

holdini Si?* ' , whlC t h r, held m ^ SCar ? r6Sult searchM S *> *° ima S e aiding unit 218 using uTimage 
holding unit 216 are integrated into one set of search results 65 file name. Various kinds of information of the irnlge include 

weilht h^H ^ f St0rCd m rae detennined Pixel values indicating the width and height of an Lge/me 

w ei ght holding umt 207 with reference to those search number of bits per pixel, the image size Qa units of bytes) 
the address of the area where a bitmap image is actually stored, and
the like, for example, when the file format of this image is the bitmap
format. Since these pieces of image information are stored in the
header field of the image file, they can be acquired by referring to
the header field. Even when the file format of the image is not the
bitmap format but JFIF or FlashPix, required information can be
similarly obtained from the header field of a file. Alternatively, the
image holding unit 218 may store such image information, and the image
information may be acquired by referring to the image holding unit 218
upon registration.

The flow advances to step S83. Step S83 is implemented by the image
feature amount extraction process 221, and extracts physical image
feature amounts by analyzing the image information corresponding to the
designated image ID. FIG. 15 above shows an example of the image
feature amounts in this embodiment, and representative colors are
extracted in units of image regions/blocks. The representative color
may be obtained by using a scheme of analyzing an actual bitmap image
using various kinds of input image information in units of pixels, and
calculating the average value of color components (values in a color
space such as RGB, HVC, or the like) used in each region or block, or a
color component with the highest frequency of occurrence as a
representative color.

The flow advances to step S84. In step S84, image feature amounts c1 to
cn extracted in step S83 are stored in the image feature amount holding
unit 222 in correspondence with the image ID of this image. The data
storage format in this case is as has already been described with
reference to FIG. 16.

The flow advances to step S85, and all sensory pattern IDs stored in
the image feature amount/sensory pattern correspondence holding unit
223, and image feature amounts corresponding to those sensory patterns
are acquired with reference to the image feature amount/sensory pattern
correspondence holding unit 223. In this embodiment, the chromatic
image feature amounts corresponding to the individual sensory patterns
are stored, as has already been described with reference to FIG. 17.

The flow advances to step S86, and the matching level between each
sensory pattern acquired in step S85 and the image feature amounts
corresponding to this image is calculated. This process is implemented
by the sensory pattern determination process 224. That is, the
chromatic image feature amounts corresponding to each of the sensory
patterns acquired in step S85 are compared with the image feature
amounts extracted in step S83 to calculate their matching level. The
matching levels for all sensory patterns stored in the image feature
amount/sensory pattern correspondence holding unit 223 are calculated.
The matching level is calculated using a scheme such as vector
computations, statistical processes, or the like using a cosine
measure.

The flow advances to step S87. In step S87, the matching levels between
all the sensory patterns and the image calculated in step S86 are
stored in the sensory pattern holding unit 220 in correspondence with
the image ID of this image. The data storage example in the sensory
pattern holding unit 220 is as has already been described previously.

The aforementioned process is done for all images to be registered.

As described above, according to this embodiment, a search using
feature amount data of multimedia information itself and a search using
a content word appended to multimedia information are made on the basis
of associative words, which are associated with a query word, and final
search results can be obtained from the results of the two search
processes. For this reason, desired image information can be accurately
extracted.

As described above, according to this embodiment, since a search
request (query word, search perspective, and the like) which is to be
considered upon searching for desired multimedia information can be
designated, an appropriate search can be made in accordance with the
designated search request, and desired image information can be
accurately extracted.

According to this embodiment, upon obtaining search results by
integrating search results obtained by a search using keywords appended
to images, and those obtained by a search using feature amount data of
images themselves, since the weight ratios on the two search processes
can be changed in correspondence with a query word, the image
information wanted can be accurately extracted. For example, when a
keyword "happy" is input as a search request, it is hard to associate
it with image feature amounts since its meaning is essentially lexical.
Hence, if a search that attaches importance to the image feature amount
is made, images which do not match the search request are highly likely
to be presented. On the other hand, for example, when a keyword "showy"
is input as a search request, the keyword "showy" is more likely to
evoke meanings measurable as image feature amounts. For this reason, if
a search is made while attaching importance to content words appended
to images, images which do not match the search request indicated by
the input keyword are highly likely to be presented. Or actually
"showy" images may be excluded from the search results. However,
according to this embodiment, when a query word "happy" is set via the
user interface shown in FIG. 4, heavier weights are set on associative
words; when a query word "showy" is set, heavier weights are set on
sensory patterns, thus making an accurate search with respect to either
query word. Of course, when the associated weight 83 and sensory
pattern weight 84 in the concept discrimination dictionaries are set, a
search can be made by only instructing the use of default weight values
via the user interface shown in FIG. 4.

In the above embodiment, image information is used as stored
information serving as test images. As for multimedia information
(e.g., audio information) other than image information, the present
invention can be applied by executing information feature amount
extraction and pairing the extracted information feature amount with
sensory patterns.

In the above description, the image holding unit 218, image content
word holding unit 219, and sensory pattern holding unit 220 which
undergo a search are allocated on the DISK 14 that builds a single
device, but these building components may be distributed on different
devices, and processes may be done on the network via the NIC 19.

Note that the present invention may be applied to either a system
constituted by a plurality of pieces of equipment (e.g., a host
computer, an interface device, a reader, a printer, and the like), or
an apparatus consisting of a single piece of equipment (e.g., a copying
machine, a facsimile apparatus, or the like).

The objects of the present invention are also achieved by supplying a
storage medium, which records a program code of a software program that
can implement the functions of the above-mentioned embodiments to the
system or apparatus, and reading out and executing the program code
stored in the storage medium by a computer (or a CPU or MPU) of the
system or apparatus.

In this case, the program code itself read out from the storage medium
implements the functions of the above-mentioned embodiments, and the
storage medium which stores the program code constitutes the present
invention.

As the storage medium for supplying the program code, 
for example, a floppy disk, hard disk, optical disk, magneto- 
optical disk, CD-ROM, CD-R, magnetic tape, nonvolatile 
memory card, ROM, and the like may be used. 

The functions of the above-mentioned embodiments may 
be implemented not only by executing the readout program 
code by the computer but also by some or all of actual 
processing operations executed by an OS (operating system) 
running on the computer on the basis of an instruction of the 
program code. 

Furthermore, the functions of the above-mentioned 
embodiments may be implemented by some or all of actual 
processing operations executed by a CPU or the like 
arranged in a function extension board or a function exten- 
sion unit, which is inserted in or connected to the computer, 
after the program code read out from the storage medium is 
written in a memory of the extension board or unit. 

As many apparently widely different embodiments of the 
present invention can be made without departing from the 
spirit and scope thereof, it is to be understood that the 
invention is not limited to the specific embodiments thereof 
except as defined in the appended claims. 
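
For illustration only, the search result integration of steps S65 to S69 and the cosine measure mentioned for step S86 might be sketched in Python as follows. The function names, the dictionary-based data layout, and the sample values are assumptions made for this sketch and do not appear in the embodiment. Only the weighted sum of the associative matching level A and the sensory matching level B, the adoption of the highest level per image, and the threshold value X are taken from the description.

```python
import math
from typing import Dict, List, Tuple


def cosine_measure(u: List[float], v: List[float]) -> float:
    """Cosine of the angle between two feature vectors (0.0 if either is all zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def integrate(
    assoc: Dict[str, float],          # image ID -> associative matching level A
    sensory: Dict[str, List[float]],  # image ID -> sensory matching levels B, one per pattern ID
    w1: float,                        # weight on the associative word search
    w2: float,                        # weight on the sensory pattern search
    threshold: float,                 # predetermined threshold value X
) -> List[Tuple[str, float]]:
    """Return (image ID, integrated matching level) pairs above the threshold."""
    results = []
    # Every image found by either search is a candidate.
    for image_id in set(assoc) | set(sensory):
        a = assoc.get(image_id, 0.0)
        levels = sensory.get(image_id) or [0.0]
        # Several sensory patterns may match one image; adopt the highest level.
        best = max(w1 * a + w2 * b for b in levels)
        if best > threshold:
            results.append((image_id, best))
    # Present results in descending order of integrated matching level.
    return sorted(results, key=lambda pair: pair[1], reverse=True)
```

In this sketch an image found by only one of the two searches is given a level of zero for the other search, which is one possible reading of "either set of search results larger than zero"; the embodiment does not spell out this detail.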

What is claimed is: 

1. An information search apparatus for searching infor- 
mation based on an input query word, comprising: 

first search means for determining a query keyword on the 
basis of the query word, and searching information on 
the basis of the query keyword; 

second search means for determining a feature amount 
corresponding to the query word, and searching infor- 
mation on the basis of the feature amount; 

setting means for setting a search weight to be assigned to 
search results of said first and second search means; 
and 

integration means for integrating search results obtained 
by said first and second search means in accordance 
with the search weight set by said setting means. 

2. The apparatus according to claim 1, wherein the search 
weight includes a first weight corresponding to the search 
result of said first search means, and a second weight 
corresponding to the search result of said second search 
means, and 

said integration means applies the first weight to a search 
matching level of each information as the search result 
of said first search means and the second weight to a 
search matching level of each information as the search 
result of said second search means to obtain an inte- 
grated search matching level, and obtains integrated 
search results on the basis of the integrated search 
matching level. 

3. The apparatus according to claim 2, wherein said 
integration means selects a predetermined number of pieces 
of information in descending order of integrated search 
matching level, and determines the selected information as 
the integrated search results. 

4. The apparatus according to claim 1, wherein said 
setting means allows a user to set desired weight ratios with 
respect to the search results of said first and second search 
means. 

5. The apparatus according to claim 1, further comprising: 
a weight dictionary which registers weights correspond- 
ing to said first and second search means in relation to 
the query word, and 

wherein said setting means sets the weights with reference 
to said weight dictionary.

6. The apparatus according to claim 5, wherein said first
search means derives an associative word associated with 
the query word, and uses the query word and the derived 
associative word as query keywords, 

said weight dictionary registers weights in units of asso- 
ciative perspectives that connect query words and asso- 
ciative words in units of query words, and 
said setting means sets the weights with reference to said 
weight dictionary on the basis of the query word and an 
associative perspective designated by a user. 

7. The apparatus according to claim 1, wherein the 
information searched is image data, 

said apparatus further comprises: 

an image content word holding unit for storing the 
image data and content words which verbalize con- 
cepts expressed in the image data in correspondence 
with each other, and 
an associative word dictionary for storing associative 
words associated with the content words, and 
said first search means acquires an associative word 
corresponding to the query word from said associative 
word dictionary, and searches said image content word 
holding unit on the basis of the acquired associative 
word. 

8. The apparatus according to claim 7, further comprising: 
a concept discrimination dictionary for storing index 

words and antithetic concepts corresponding to the 
index words in correspondence with each other; and 
input means for inputting the query word and a search 
perspective, and 

wherein said first search means acquires an index word 
and antithetic concept corresponding to the query word 
from said concept discrimination dictionary on the 
basis of the query word and search perspective input by 
said input means, and acquires an associative word 
corresponding to the query word from said associative 
word dictionary on the basis of the acquired index word 
and antithetic concept. 

9. The apparatus according to claim 1, further comprising: 
a holding unit for storing associative words and sensory 

patterns in correspondence with each other, and 
wherein said second search means acquires a sensory 
pattern corresponding to an associative word, which 
corresponds to the query word, from said holding unit, 
and extracts a feature amount of the acquired sensory 
pattern as the feature amount corresponding to the 
query word. 

10. The apparatus according to claim 1, wherein multi- 
media information is image information, and the feature 
amount is a physical image feature amount obtained by 
analyzing the image information. 

11. The apparatus according to claim 10, wherein the 
feature amount includes at least one of color scheme 
information, composition information, and shape information
contained in an image.

12. An information search method for searching informa- 
tion based on an input query word, comprising: 

a first search step, of determining a query keyword on the 
basis of the query word, and searching information on 
the basis of the query keyword; 

a second search step, of determining a feature amount 
corresponding to the query word, and searching infor- 
mation on the basis of the feature amount; 

a setting step, of setting a search weight to be assigned to
search results in the first and second search steps; and

an integration step, of integrating search results obtained
in the first and second search steps in accordance with
the search weight set in the setting step.

13. The method according to claim 12, wherein the search
weight includes a first weight corresponding to the search
result in the first search step, and a second weight corre-
sponding to the search result in the second search step, and

the integration step includes the step of applying the first
weight to a search matching level of each information
as the search result in the first search step and the
second weight to a search matching level of each
information as the search result in the second search
step to obtain an integrated search matching level, and
obtaining integrated search results on the basis of the
integrated search matching level.

14. The method according to claim 13, wherein the
integration step includes the step of selecting a predeter-
mined number of pieces of information in descending order
of integrated search matching level, and determining the
selected information as the integrated search results.

15. The method according to claim 12, wherein the setting 
step includes the step of allowing a user to set desired weight 
ratios with respect to the search results in the first and second 
search steps. 

16. The method according to claim 12, wherein the setting
step includes a step of setting the weights with reference to
a weight dictionary which registers weights corresponding
to the first and second search steps in relation to the query
word.

17. The method according to claim 16, wherein the first
search step includes a step of deriving an associative word
associated with the query word, and using the query word
and the derived associative word as query keywords,

the weight dictionary registers weights in units of asso-
ciative perspectives that connect query words and asso-
ciative words in units of query words, and

the setting step includes a step of setting the weights with
reference to the weight dictionary on the basis of the
query word and an associative perspective designated
by a user.

18. The method according to claim 12, wherein the
information searched is image data, and said method is
performed using:

an image content word holding unit for storing the image
data and content words which verbalize concepts
expressed in the image data in correspondence with
each other; and

an associative word dictionary for storing associative
words associated with the content words, and

wherein the first search step includes a step of acquiring
an associative word corresponding to the query word
from the associative word dictionary, and searching the
image content word holding unit on the basis of the
acquired associative word.

19. The method according to claim 18, wherein said 
method is performed using a concept discrimination dictio- 
nary for storing index words and antithetic concepts corre- 
sponding to the index words in correspondence with each 
other; and 

wherein said method further comprises an input step, of 
inputting the query word and a search perspective, and 

wherein the first search step includes a step of acquiring 
an index word and antithetic concept corresponding to 
the query word from the concept discrimination dic- 
tionary on the basis of the query word and search 
perspective input in the input step, and acquiring an 
associative word corresponding to the query word from 
the associative word dictionary on the basis of the 
acquired index word and antithetic concept. 

20. The method according to claim 12, wherein said 
method is performed using a holding unit for storing asso- 
ciative words and sensory patterns in correspondence with 
each other, and 

wherein the second search step includes a step of acquir- 
ing a sensory pattern corresponding to an associative 
word, which corresponds to the query word, from the 
holding unit, and extracting a feature amount of the 
acquired sensory pattern as the feature amount corre- 
sponding to the query word. 

21. The method according to claim 12, wherein multime- 
dia information is image information, and the feature 
amount is a physical image feature amount obtained by 
analyzing the image information. 

22. The method according to claim 21, wherein the feature 
amount includes at least one of color scheme information, 
composition information, and shape information contained 
in an image.

23. A storage medium for storing a control program which
makes a computer search information based on an input 
query word, said control program comprising: 

a code of the first search step of determining a query 
keyword on the basis of the query word, and searching 
information on the basis of the query keyword; 

a code of the second search step of determining a feature 
amount corresponding to the query word, and searching 
information on the basis of the feature amount; 

a code of the setting step of setting a search weight to be 
assigned to search results in the first and second search 
steps; and 

a code of the integration step of integrating search results 
obtained in the first and second search steps in accor- 
dance with the search weight set in the setting step. 
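
As a non-authoritative illustration of the setting step (claims 4, 5, 15 and 16) and of the selection of a predetermined number of results (claims 3 and 14), a minimal Python sketch is given below. The dictionary contents, the default weights, and the function names are hypothetical and chosen only for this example; the weights for "happy" and "showy" merely follow the tendency described in the embodiment (heavier weight on associative words for "happy", heavier weight on sensory patterns for "showy").

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical weight dictionary:
# (query word, associative perspective) -> (first weight, second weight)
WEIGHT_DICTIONARY: Dict[Tuple[str, Optional[str]], Tuple[float, float]] = {
    ("happy", None): (0.8, 0.2),  # keyword search weighted more heavily
    ("showy", None): (0.2, 0.8),  # feature amount search weighted more heavily
}
DEFAULT_WEIGHTS = (0.5, 0.5)  # assumed fallback when no entry is registered


def set_weights(
    query_word: str,
    perspective: Optional[str] = None,
    user_weights: Optional[Tuple[float, float]] = None,
) -> Tuple[float, float]:
    """Return (w1, w2); weight ratios given by the user take priority."""
    if user_weights is not None:
        total = user_weights[0] + user_weights[1]
        if total <= 0:
            raise ValueError("weight ratios must have a positive sum")
        # Normalize the ratios so that w1 + w2 = 1.
        return (user_weights[0] / total, user_weights[1] / total)
    return WEIGHT_DICTIONARY.get((query_word, perspective), DEFAULT_WEIGHTS)


def top_n(levels: Dict[str, float], n: int) -> List[Tuple[str, float]]:
    """Select n items in descending order of integrated search matching level."""
    return sorted(levels.items(), key=lambda item: item[1], reverse=True)[:n]
```

A caller would first obtain the weights with `set_weights`, compute an integrated search matching level per item with them, and then pass the resulting mapping to `top_n`.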
INFORMATION SEARCH APPARATUS AND
METHOD, AND COMPUTER READABLE 
MEMORY 

BACKGROUND OF THE INVENTION 
The present invention relates to an information search 
apparatus and method for searching information on the basis 
of an input query word. 

A conventional information search apparatus, which 
searches multimedia information, e.g., image information, 
makes a search using data (keywords) derived from subjective
evaluation results of one or a plurality of persons for
images to be searched, physical image features extracted 
from images, and the like. 

Also, an image search apparatus that obtains a required 
image by matching a given keyword with that corresponding 
to an image has been realized. Furthermore, an information 
search apparatus, which obtains an image, that cannot be 
obtained by full-word matching with an input keyword, by 
matching not only the input keyword but also an associated 
keyword associated with the input keyword with a keyword
corresponding to an image, has also been realized. 
Moreover, an information search apparatus which obtains an 
image with similar color information by detecting a corre- 
spondence between the input keyword and color information 
using, e.g., color information of images is proposed.

For example, in one scheme, an impression that a person 
receives upon watching an image, or key information linked 
with the impression is appended to image information and is 
used in search. As the key information, words that express
impressions evoked by images such as "warm", "cold", and
the like, and words that represent objects in drawn images
such as "kitty", "sea", "mountain", and the like are appended
as keywords. Also, local image feature components on
drawn images are subjectively evaluated and are often
appended as key information. For example, information that
pertains to a color such as "red", "blue", and the like,
information that pertains to a shape such as "round",
"triangular", "sharp", and the like, and information that
pertains to a texture such as "sandy", "smooth", and the like
are expressed using words and icons, are appended to 
images as key information, and are used in search. 

In another system, physical image feature amounts are 
extracted from images, and are used in image search. Image 
features include local colors painted on images, overall color
tones, and shapes, compositions, textures, and the like of 
objects on drawn images. An image feature amount is 
extracted from segmented regions or blocks obtained by 
segmenting the overall image into regions based on color 
information, or segmenting the image into blocks each
having a given area, or is extracted from the entire image. 
Physical image features include, e.g., color information, 
density distribution, texture, edge, region, area, position, 
frequency distribution, and the like of an image. 
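
To make the block-based extraction described above concrete, the following Python sketch computes an average color for each fixed-size block of an image. It is an assumption-laden example, not the method of any particular prior-art system: the image is modeled as a nested list of RGB tuples, and the function name and block layout are hypothetical.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]             # (R, G, B), each 0..255
Color = Tuple[float, float, float]


def block_average_colors(image: List[List[Pixel]], block: int) -> List[List[Color]]:
    """Segment the image into block x block areas and average the color in each."""
    height = len(image)
    width = len(image[0]) if image else 0
    result = []
    for top in range(0, height, block):
        row = []
        for left in range(0, width, block):
            # Collect the pixels of this block; edge blocks may be smaller.
            pixels = [
                image[y][x]
                for y in range(top, min(top + block, height))
                for x in range(left, min(left + block, width))
            ]
            count = len(pixels)
            row.append(tuple(sum(p[c] for p in pixels) / count for c in range(3)))
        result.append(row)
    return result
```

The average is only one possible feature amount; a color histogram or the most frequent color per block could be substituted without changing the block segmentation.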

However, in the prior art, when an image including a
keyword that matches the input query word is searched for,
images which do not match the search request of the
searcher are often obtained. Especially, when an image
search is made using an abstract query word such as a
"refreshing" image, images found by the search are limited.
To solve this problem, a search may be made by unfolding
the query word "refreshing" to keywords which are associ-
ated with that query word. However, when such a scheme is
used, images which are not "refreshing" may be mixed in
search results.
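
The unfolding scheme and its drawback can be illustrated with a small Python sketch. All data here is hypothetical: the associative words, the image IDs, and the keywords attached to each image are invented for the example, and the function names are not taken from any existing system.

```python
from typing import Dict, List, Set

# Hypothetical associative word dictionary and keyword annotations.
ASSOCIATIVE_WORDS: Dict[str, List[str]] = {
    "refreshing": ["forest", "tableland", "blue sky"],
}
IMAGE_KEYWORDS: Dict[str, Set[str]] = {
    "img1": {"forest", "lake"},
    "img2": {"forest", "fire"},   # shares a keyword but is hardly "refreshing"
    "img3": {"city", "night"},
}


def unfold(query_word: str) -> List[str]:
    """Unfold a query word into itself plus its associated keywords."""
    return [query_word] + ASSOCIATIVE_WORDS.get(query_word, [])


def keyword_search(query_word: str) -> List[str]:
    """Return IDs of images having at least one keyword in the unfolded set."""
    keywords = set(unfold(query_word))
    return sorted(
        image_id for image_id, words in IMAGE_KEYWORDS.items() if words & keywords
    )
```

With this data, a search for "refreshing" returns both `img1` and `img2`, which shows how an image that merely shares an associated keyword can be mixed into the results.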

In place of query words, a query image may be input, and 
a search may be made using the feature amount of the input
image. In this case, a query image that reflects the searcher's
will must be prepared, and it is difficult to select a query 
image, resulting in poor operability. 

SUMMARY OF THE INVENTION 

The present invention has been made in consideration of 
the above-mentioned problems, and has as its object to 
provide an image search method and apparatus which can 
extract desired information with high precision with respect 
to an input query word.

In order to achieve the above object, an image search 
apparatus according to the present invention comprises the 
following arrangement. 

That is, there is provided an information search apparatus 
for searching information based on an input query word, 
comprising: 

first search means for determining a query keyword on the
basis of the query word, and searching information on 
the basis of the query keyword; 
second search means for determining a feature amount of 
a pattern corresponding to the query word, and search- 
ing information on the basis of the feature amount; and 
integration means for integrating search results obtained
by the first and second search means.

In order to achieve the above object, an image search
method according to the present invention comprises the 
following arrangement. 

That is, there is provided an information search method 
for searching information based on an input query word, 
comprising: 

the first search step of determining a query keyword on 
the basis of the query word, and searching information 
on the basis of the query keyword; 
the second search step of determining a feature amount of 
a pattern corresponding to the query word, and search- 
ing information on the basis of the feature amount; and 
the integration step of integrating search results obtained
in the first and second search steps.

In order to achieve the above object, an image search
apparatus according to the present invention comprises the 
following arrangement. 

That is, there is provided an information search apparatus 
for managing a plurality of kinds of multimedia information, 
and searching the managed multimedia information for 
desired multimedia information, comprising: 
a content word holding unit for storing the multimedia 
information, and content words which verbalize con- 
cepts expressed in the multimedia information in cor- 
respondence with each other; 
an associative word dictionary for storing the content 
words and associative words which are associated with 
the content words in correspondence with each other; 
input means for inputting a query word; 
first search means for acquiring an associative word 
corresponding to the query word input by the input 
means from the associative word dictionary, and 
searching multimedia information on the basis of the 
acquired associative word; 
extraction means for extracting a feature amount corre- 
sponding to the query word input by the input means; 
second search means for searching multimedia informa- 
tion on the basis of the feature amount extracted by the 
extraction means; and 
integration means for integrating search results obtained 
by the first and second search means.

In order to achieve the above object, an image search
method according to the present invention comprises the 
following arrangement. 

That is, there is provided an information search method 
for managing a plurality of kinds of multimedia information, 
and searching the managed multimedia information for 
desired multimedia information, comprising: 

the storage step of storing on a storage medium a content 
word holding unit for storing the multimedia 
information, and content words which verbalize con- 
cepts expressed in the multimedia information in cor- 
respondence with each other, and an associative word 
dictionary for storing the content words and associative 
words which are associated with the content words in
correspondence with each other; 
the input step of inputting a query word; 
the first search step of acquiring an associative word 
corresponding to the query word input in the input step 
from the associative word dictionary, and searching 
multimedia information on the basis of the acquired 
associative word; 

the extraction step of extracting a feature amount corre- 
sponding to the query word input in the input step; 
the second search step of searching multimedia informa- 
tion on the basis of the feature amount extracted in the 
extraction step; and 
the integration step of integrating search results obtained
in the first and second search steps.
In order to achieve the above object, a computer readable 
memory according to the present invention comprises the 
following arrangement. 

That is, there is provided a computer readable memory for 
storing a program code of an information search process for 
managing a plurality of kinds of multimedia information, 
and searching the managed multimedia information for 
desired multimedia information, comprising: 

a program code of the storage step of storing on a storage 
medium a content word holding unit for storing the 
multimedia information, and content words which ver- 
balize concepts expressed in the multimedia informa- 
tion in correspondence with each other, and an asso- 
ciative word dictionary for storing the content words 
and associative words which are associated with the 
content words in correspondence with each other; 
a program code of the input step of inputting a query 
word; 

a program code of the first search step of acquiring an 
associative word corresponding to the query word input 
in the input step from the associative word dictionary, 
and searching multimedia information on the basis of 
the acquired associative word; 

a program code of the extraction step of extracting a 
feature amount corresponding to the query word input 
in the input step; 

a program code of the second search step of searching 
multimedia information on the basis of the feature 
amount extracted in the extraction step; and 

a program code of the integration step of integrating 
search results obtained in the first and second search 
steps. 

In order to achieve the above object, an image search 
apparatus according to the present invention comprises the 
following arrangement. 

That is, there is provided an information search apparatus 
for managing a plurality of kinds of multimedia information, 
and searching the managed multimedia information for
desired multimedia information, comprising: 

a content word holding unit for storing the multimedia 
information, and content words which verbalize con- 
cepts expressed in the multimedia information in cor- 
respondence with each other; 

an associative word dictionary for storing the content 
words and associative words which are associated with 
the content words in correspondence with each other; 

input means for inputting a query word; 

a concept discrimination dictionary for storing index 
words corresponding to the query word and search 
perspectives pertaining to the index words in corre- 
spondence with each other; 

display means for extracting search perspectives pertaining
to an index word corresponding to the query word
input by the input means from the concept discrimina- 
tion dictionary, and displaying the extracted search 
perspectives; 

designation means for designating a desired one of the 
search perspectives displayed by the display means; 

first search means for acquiring an associative word 
corresponding to the query word input by the input 
means from the associative word dictionary, and 
searching multimedia information on the basis of the 
acquired associative word; 

second search means for extracting a feature amount 
corresponding to the query word input by the input 
means, and searching multimedia information on the 
basis of the extracted feature amount; and 

integration means for integrating search results obtained 
by the first and second search means on the basis of the 
search perspective designated by the designation 
means. 

In order to achieve the above object, an image search 
method according to the present invention comprises the 
following arrangement. 

That is, there is provided an information search method 
for managing a plurality of kinds of multimedia information, 
and searching the managed multimedia information for 
desired multimedia information, comprising: 
the input step of inputting a query word; 
the storage step of storing on a storage medium a content 
word holding unit for storing the multimedia 
information, and content words which verbalize con- 
cepts expressed in the multimedia information in cor- 
respondence with each other, an associative word dic- 
tionary for storing the content words and associative 
words which are associated with the content words in 
correspondence with each other, and a concept dis- 
crimination dictionary for storing index words corre- 
sponding to the query word and search perspectives 
pertaining to the index words in correspondence with 
each other; 

the display step of extracting search perspectives pertain- 
ing to an index word corresponding to the query word 
input in the input step from the concept discrimination 
dictionary, and displaying the extracted search perspec- 
tives; 

the designation step of designating a desired one of the 
search perspectives displayed in the display step; 

the first search step of acquiring an associative word 
corresponding to the query word input in the input step 
from the associative word dictionary, and searching 
multimedia information on the basis of the acquired 
associative word; 
the second search step of extracting a feature amount
corresponding to the query word input in the input step,
and searching multimedia information on the basis of
the extracted feature amount; and

the integration step of integrating search results obtained
in the first and second search steps on the basis of the
search perspective designated in the designation step.

In order to achieve the above object, a computer readable
memory according to the present invention comprises the
following arrangement.

That is, there is provided a computer readable memory for
storing a program code of an information search process for
managing a plurality of kinds of multimedia information,
and searching the managed multimedia information for
desired multimedia information, comprising:

a program code of the input step of inputting a query
word;

a program code of the storage step of storing on a storage
medium a content word holding unit for storing the
multimedia information, and content words which
verbalize concepts expressed in the multimedia
information in correspondence with each other, an
associative word dictionary for storing the content words
and associative words which are associated with the
content words in correspondence with each other, and a
concept discrimination dictionary for storing index words
corresponding to the query word and search perspectives
pertaining to the index words in correspondence with
each other;

a program code of the display step of extracting search
perspectives pertaining to an index word corresponding
to the query word input in the input step from the
concept discrimination dictionary, and displaying the
extracted search perspectives;

a program code of the designation step of designating a
desired one of the search perspectives displayed in the
display step;

a program code of the first search step of acquiring an
associative word corresponding to the query word input
in the input step from the associative word dictionary,
and searching multimedia information on the basis of
the acquired associative word;

a program code of the second search step of extracting a
feature amount corresponding to the query word input
in the input step, and searching multimedia information
on the basis of the extracted feature amount; and

a program code of the integration step of integrating
search results obtained in the first and second search
steps on the basis of the search perspective designated
in the designation step.

Other features and advantages of the present invention
will be apparent from the following description taken in
conjunction with the accompanying drawings, in which like
reference characters designate the same or similar parts
throughout the figures thereof.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram showing the arrangement of an
information search apparatus according to an embodiment of
the present invention;

FIG. 2 is a block diagram showing the functional
arrangement of the information search apparatus according
to the embodiment of the present invention;

FIG. 3 is a table showing the structure of a sensory
pattern/associative word correspondence holding unit in the
embodiment of the present invention;

FIG. 4 is a table showing the structure of an unfolded
sensory pattern holding unit in the embodiment of the
present invention;

FIG. 5 is a table showing the structure of a sensory pattern
holding unit in the embodiment of the present invention;

FIG. 6 is a table showing an example of image feature
amounts in the embodiment of the present invention;

FIG. 7 is a table showing the structure of an image feature
amount/sensory pattern holding unit in the embodiment of
the present invention;

FIG. 8 shows a display example of a search perspective
input at a search request input processing unit in the
embodiment of the present invention;

FIG. 9 shows a display example on a control panel upon
instructing search weights in the embodiment of the present
invention;

FIG. 10 is a table showing the structure of an image
holding unit in the embodiment of the present invention;

FIG. 11 is a table showing the structure of an image
content word holding unit in the embodiment of the present
invention;

FIG. 12 is a table showing another structure of an image
content word holding unit in the embodiment of the present
invention;

FIG. 13 is a table showing the structure of a concept
discrimination dictionary in the embodiment of the present
invention;

FIG. 14 is a table showing the structure of an associative
word dictionary in the embodiment of the present invention;

FIG. 15 is a table showing the structure of a search result
holding unit in the embodiment of the present invention;

FIG. 16 is a table showing another example of feature
amounts in the embodiment of the present invention;

FIG. 17 is a table showing the structure of an image
feature amount holding unit in the embodiment of the
present invention;

FIG. 18 is a flow chart showing processes executed in the
embodiment of the present invention;

FIG. 19 is a flow chart showing details of a search request
input process in the embodiment of the present invention;

FIG. 20 is a flow chart showing details of a search process
in the embodiment of the present invention;

FIG. 21 is a flow chart showing details of a search process
using a sensory pattern in step S3006 and a search result
combining process in step S3007 in the embodiment of the
present invention; and

FIG. 22 is a flow chart showing an image registration
process in the embodiment of the present invention.

DESCRIPTION OF THE PREFERRED EMBODIMENTS

The preferred embodiments of the present invention will
be described in detail hereinafter with reference to the
accompanying drawings.

FIG. 1 shows the arrangement of an information search
apparatus according to an embodiment of the present
invention.

Referring to FIG. 1, reference numeral 11 denotes a
microprocessor (CPU), which makes computations, logical
decisions, and the like for image information search in
accordance with control programs, and controls individual
building components connected to an address bus AB,
control bus CB, and data bus DB via these buses. The
address bus AB transfers an address signal indicating the
building component to be controlled by the CPU 11. The
control bus CB transfers and applies a control signal for each
building component to be controlled by the CPU 11. The
data bus DB transfers data among the respective building
components.

Reference numeral 12 denotes a read-only memory
(ROM), which stores control programs such as a boot
processing program executed by the CPU 11 upon starting
up the apparatus of this embodiment, a processing program
executed in this embodiment, and the like. Reference
numeral 13 denotes a rewritable random access memory
(RAM) which is configured by 16 bits per word, and is used
as a temporary storage of various data from the respective
building components. Also, the RAM 13 stores a query word
holding unit 202, search perspective holding unit 203, search
weight holding unit 204, determined weight holding unit
207, unfolded associative word holding unit 209, unfolded
sensory pattern holding unit 213, and search result holding
unit 216, which will be described later with reference to
FIG. 2.

Reference numeral 14 denotes an external memory
(DISK), which stores a concept discrimination dictionary
205, associative word dictionary 211, image word/sensory
pattern correspondence holding unit 215, image content
word holding unit 219, image holding unit 218, sensory
pattern holding unit 220, image feature amount holding unit
222, and image feature amount/sensory pattern
correspondence holding unit 223, which will be described
later with reference to FIG. 2. Also, the external memory 14
stores programs for respectively implementing processing
units, i.e., a search request input processing unit 201, weight
determination processing unit 206, associative word
unfolding processing unit 208, image content word search
unit 210 using associative words, sensory pattern unfolding
processing unit 212, sensory pattern search processing unit
214, search result integration processing unit 217, image
feature amount extraction processing unit 221, and sensory
pattern determination processing unit 224, which will be
described later with reference to FIG. 2. As a storage medium
for storing these programs, a ROM, floppy disk, CD-ROM,
memory card, magnetooptical disk, or the like can be used.

Reference numeral 15 denotes a keyboard (KB) which has,
among others, symbol input keys for inputting a period,
comma, and the like, a search key for instructing a search (a
function key on a general keyboard may be used instead),
and various function keys such as cursor moving keys for
instructing cursor movement, and the like. Also, a pointing
device such as a mouse or the like (not shown) may be
connected.

Reference numeral 16 denotes a display video memory
(VRAM) for storing a pattern of data to be displayed.
Reference numeral 17 denotes a CRT controller (CRTC) for
displaying the contents stored in the VRAM 16 on a CRT 18.
Reference numeral 18 denotes a display device (CRT) using,
e.g., a cathode ray tube, or the like. The dot display pattern
and cursor display on the CRT 18 are controlled by the
CRTC 17. Note that various other displays such as a liquid
crystal display, and the like may be used as the display
device. Reference numeral 19 denotes a network controller
(NIC), which connects the apparatus to a network such as
Ethernet or the like.

The information search apparatus of this embodiment
constructed by the aforementioned building components
operates in accordance with various inputs from the KB 15
and various inputs supplied from the NIC 19 via the network.
Upon receiving the input from the KB 15 or NIC 19, an
interrupt signal is sent to the CPU 11. Upon receiving the
interrupt signal, the CPU 11 reads out various control signals
stored in the DISK 14, and executes various kinds of control
in accordance with these control signals. Also, the present
invention is achieved by supplying a storage medium that
stores a program according to the present invention to a
system or apparatus, and by reading out and executing
program codes stored in the storage medium by a computer
of the system or apparatus.

The functional arrangement of the information search
apparatus of this embodiment will be explained below with
reference to FIG. 2.

FIG. 2 is a block diagram showing the functional
arrangement of the information search apparatus according
to the embodiment of the present invention.

Referring to FIG. 2, reference numeral 201 denotes a
search request input processing unit for inputting query
items (query word, search perspective or category, search
weight, and the like) that pertain to the information wanted.
Reference numeral 202 denotes a query word holding unit
for storing a query word input by the search request input
processing unit 201. Reference numeral 203 denotes a
search perspective holding unit for storing a search
perspective input by the search request input processing unit
201. Reference numeral 204 denotes a search weight holding
unit for storing a search weight input by the search request
input processing unit 201.

Reference numeral 205 denotes a concept discrimination
dictionary having a search perspective that pertains to a
concept as information wanted, an antithetic concept having
a contrary or antonymous meaning, and two kinds of
coefficients for weight discrimination upon searching for a
concept. Reference numeral 206 denotes a weight
determination processing unit for giving weights (associated
weight and sensory pattern weight) indicating the weight
balance on associative words (obtained by an associative
word unfolding processing unit 208) and sensory patterns
(obtained by a sensory pattern unfolding processing unit
212) upon searching using a query word stored in the query
word holding unit 202. Reference numeral 207 denotes a
determined weight holding unit for holding the search
weight determined by the weight determination processing
unit 206. Reference numeral 208 denotes an associative word
unfolding processing unit for unfolding the query word
stored in the query word holding unit 202 into associative
words with reference to an associative word dictionary 211,
obtaining an antithetic concept antonymous to that query
word from the concept discrimination dictionary 205, and
unfolding the antithetic concept into associative words with
reference to the associative word dictionary 211. Reference
numeral 209 denotes an unfolded associative word holding
unit for holding the associative words (including those of the
antithetic concept) unfolded by the associative word
unfolding processing unit 208. Reference numeral 210
denotes an image content word search processing unit using
associative words, which finds image content words, which
are stored in an image content word holding unit 219 and
match the unfolded associative words, by search with
reference to the unfolded associative word holding unit 209.

Reference numeral 211 denotes an associative word
dictionary for storing associative words to be unfolded in
units of concepts serving as index words in correspondence
with associative perspectives. Reference numeral 212
denotes a sensory pattern unfolding processing unit for
unfolding the
query word stored in the query word holding unit 202 into
sensory patterns with reference to an image word/sensory
pattern correspondence holding unit 215, obtaining an
antithetic concept antonymous to the stored query word from
the concept discrimination dictionary 205, and unfolding the
obtained antithetic concept into sensory patterns with
reference to the image word/sensory pattern correspondence
holding unit 215.

Reference numeral 215 denotes an image word/sensory
pattern correspondence holding unit for storing image words
and sensory patterns in correspondence with each other, i.e.,
storing image words and sensory pattern IDs corresponding
to associative words, which are associated with the image
words. Note that FIG. 3 shows a data storage example of the
image word/sensory pattern correspondence holding unit
215. The structure of the image word/sensory pattern
correspondence holding unit 215 will be described in detail
later.

Reference numeral 213 denotes an unfolded sensory
pattern holding unit for temporarily storing the sensory
patterns unfolded by the sensory pattern unfolding
processing unit 212. The unit 213 is stored in the RAM 13.
Note that FIG. 4 shows a data storage example of the
unfolded sensory pattern holding unit 213. The structure of
the unfolded sensory pattern holding unit 213 will be
described in detail later.

Reference numeral 214 denotes a sensory pattern search
processing unit for finding sensory patterns, which are
stored in a sensory pattern holding unit 220 and are similar
to the unfolded sensory patterns, by search with reference to
the sensory pattern holding unit 220. Reference numeral 217
denotes a search result integration processing unit for
integrating the search result of image content words using
the associative words, and the search results of sensory
patterns stored in a search result holding unit 216, on the
basis of the search weights obtained by the weight
determination processing unit 206.

Reference numeral 219 denotes an image content word
holding unit for storing content words that verbalize
concepts expressed in image information stored in an image
holding unit 218. Reference numeral 218 denotes an image
holding unit for storing image information serving as test
images. Reference numeral 220 denotes a sensory pattern
holding unit for holding sensory patterns obtained from the
image information stored in the image holding unit 218, and
storing matching levels with respective sensory patterns in
units of image IDs each indicating image information. Note
that FIG. 5 shows a data storage example of the sensory
pattern holding unit 220. The structure of the sensory pattern
holding unit 220 will be described in detail later.

Reference numeral 221 denotes an image feature
extraction processing unit for extracting physical image
feature amounts from the image information stored in the
image holding unit 218. Physical image feature amounts are
visual features or signatures extracted from segmented
regions of an image. The image feature amount is, e.g.,
numerical information such as the color distribution or
histogram, density distribution, texture, edge, and the like.

Reference numeral 222 denotes an image feature amount
holding unit for storing the image feature amounts obtained
by the image feature amount extraction processing unit 221.
Reference numeral 223 denotes an image feature amount/
sensory pattern correspondence holding unit for storing
image feature amounts and sensory patterns in
correspondence with each other, i.e., storing sensory pattern
IDs and image feature amounts corresponding to those IDs.
Note that FIG. 7 shows a data storage example of the image
feature amount/sensory pattern correspondence holding unit
223. The image feature amount/sensory pattern
correspondence holding unit 223 will be described in detail
later.

Reference numeral 224 denotes a sensory pattern
determination processing unit for comparing a sensory
pattern and image feature amount extracted from image
information to obtain their matching level with reference to
the image feature amount/sensory pattern correspondence
holding unit 223, and registering the matching level in the
sensory pattern holding unit 220.

A display example of a search perspective that pertains to
search request items input at the search request input
processing unit 201 will be explained below with reference
to FIG. 8.

FIG. 8 shows a display example of a search perspective
input at the search request input processing unit in the
embodiment of the present invention.

When a query word is input by operating, e.g., the
keyboard 15, the concept discrimination dictionary 205
shown in FIG. 13 is searched using the query word as an
index word to extract corresponding search perspectives.
FIG. 8 illustrates that three search perspectives "color
tone", "taste", and "general atmosphere" are available in
relation to a query word "mild", and hatched "color tone" is
selected as the search perspective. When the user presses an
OK button in this state, the search perspective "color tone"
is selected, and is held in the search perspective holding unit
203. Also, the query word "mild" is held in the query word
holding unit 202.

By pressing the cursor moving keys on the keyboard 15,
the hatching moves from "color tone" to "taste" or
"general atmosphere", and the user can designate a desired
search perspective or category.

A display example on the control panel when the operator
instructs the search weight balance on a search using
associative words and a search using sensory patterns in
actual search will be explained below with reference to
FIG. 9. As described above, a search using associative words
and a search using the feature amounts of images (sensory
patterns) based on the query word are made, and the search
results are integrated. In this integration process, the two
search results are weighted. On the control panel, the user
can designate a weight for a search using associative words
and that for a search using sensory patterns. That is, the user
can designate the weight balance on a search using
associative words and that using sensory patterns in actual
search.

FIG. 9 shows a display example of the control panel upon
instructing search weights in the embodiment of the present
invention.

Referring to FIG. 9, when the user slides a slide button 41
to the left, an instruction that sets a heavier weight on a
search using associative words is issued; when he or she
slides the slide button 41 to the right, an instruction that sets
a heavier weight on a search using sensory patterns is
issued. A button 42 in the display area is pressed when no
search weights are clearly designated, and in such case, a
predetermined search weight instruction is issued. Upon
depression of the button 42, predetermined
weight values (which are obtained from an associated
weight 83 and sensory pattern weight 84 in the concept 
discrimination dictionary 205) are used. The set weights are 
stored in the search weight holding unit 204. Note that the 
buttons 41 to 43 on the control panel may be clicked by a 
pointing device (not shown). 
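
The weight balance described above can be illustrated with a minimal Python sketch. The text does not specify how the position of the slide button maps to numeric weights or how the two result sets are scored, so the linear mapping, the score dictionaries, and the function names `balance_to_weights` and `integrate` are all assumptions made for illustration only.

```python
def balance_to_weights(slider):
    """Map a slider position in [0.0, 1.0] to (associative, sensory) weights.

    Assumption: 0.0 is the far-left position (heavier weight on the
    associative word search) and 1.0 is the far-right position (heavier
    weight on the sensory pattern search), with a linear mapping between.
    """
    if not 0.0 <= slider <= 1.0:
        raise ValueError("slider must be in [0.0, 1.0]")
    return 1.0 - slider, slider


def integrate(assoc_scores, sensory_scores, weights):
    """Combine two {image_id: matching_level} maps into one ranked list.

    An image missing from one search contributes 0.0 for that search.
    Results are sorted by descending combined score, then by image ID.
    """
    w_assoc, w_sens = weights
    image_ids = set(assoc_scores) | set(sensory_scores)
    combined = {
        image_id: w_assoc * assoc_scores.get(image_id, 0.0)
        + w_sens * sensory_scores.get(image_id, 0.0)
        for image_id in image_ids
    }
    return sorted(combined.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical matching levels for three images.
ranked = integrate({1: 0.8, 2: 0.2}, {2: 1.0, 3: 0.4}, balance_to_weights(0.25))
```

When no weights are designated, the same `integrate` call could instead be given the predetermined pair taken from the concept discrimination dictionary.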

The structure of the image holding unit 218 will be 
described below using FIG. 10. 

FIG. 10 shows the structure of the image holding unit in 
the embodiment of the present invention. 

The image holding unit 218 manages image information 
by storing image IDs each indicating image information 
(image files) and image file storage paths each indicating the 
storage location of image information. Referring to FIG. 10, 
reference numeral 2180 denotes an image ID which is 
uniquely assigned to one image file. Reference numeral 
2181 denotes a file path which indicates the storage location 
of an image file corresponding to the image ID in the DISK 
14, and corresponds to the directory and file of MS-DOS. 

An image file is divided into header and image data fields 
(not shown in FIG. 10). The header field stores information 
required for reading data from that image file, and additional 
information that explains the image contents. As such 
information, an image format identifier indicating the image 
format name of the image, file size, image width, height, and 
depth, the presence/absence of compression, color palette
information, resolution, offset to the storage location of 
image data, and the like are stored. The image data field 
stores image data in turn. This embodiment uses the BMP 
format of Microsoft Corp. as such image format, but other 
compression formats such as GIF, JPEG, FlashPix, and the 
like may be used. 
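
As an illustration of the header fields listed above, the following Python sketch reads them from the start of a BMP file. It assumes the common layout of a 14-byte file header followed by a 40-byte BITMAPINFOHEADER; other BMP header variants exist and are not handled. The function name `read_bmp_header` and the returned dictionary keys are hypothetical.

```python
import struct

# Little-endian layouts (assumed: BITMAPFILEHEADER + 40-byte BITMAPINFOHEADER).
FILE_HEADER = struct.Struct("<2sIHHI")       # magic, file size, 2 reserved, data offset
INFO_HEADER = struct.Struct("<IiiHHIIiiII")  # size, width, height, planes, bit depth, ...


def read_bmp_header(data):
    """Return the header fields needed to locate and interpret the image data."""
    magic, file_size, _res1, _res2, data_offset = FILE_HEADER.unpack_from(data, 0)
    if magic != b"BM":
        raise ValueError("not a BMP file")
    (_hdr_size, width, height, _planes, bit_depth, compression,
     _image_size, x_ppm, y_ppm, colors_used, _colors_important) = (
        INFO_HEADER.unpack_from(data, FILE_HEADER.size))
    return {
        "format": "BMP",
        "file_size": file_size,
        "width": width,
        "height": height,
        "depth": bit_depth,
        "compressed": compression != 0,
        "palette_colors": colors_used,
        "resolution_ppm": (x_ppm, y_ppm),
        "data_offset": data_offset,
    }
```

The `data_offset` value corresponds to the offset to the storage location of image data mentioned in the text; the image data field begins at that byte position.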

The structure of the image content word holding unit 219 
will be described below with the aid of FIG. 11. 

FIG. 11 shows the structure of the image content word 
holding unit in the embodiment of the present invention. 

The image content word holding unit 219 manages image 
information by storing the image IDs and image content 
words in correspondence with each other. Referring to FIG.
11, reference numeral 21900 denotes a field for storing 
image IDs corresponding to the image IDs 2180 shown in 
FIG. 10; and 21901, a field for storing image content words 
that express image files corresponding to the image IDs 
21900. The image content word verbalizes an image feature
expressed in an image file, and stores a keyword as a 
character code (e.g., Unicode). A plurality of keywords may 
be stored per image file, and the image content word holding 
unit 219 is expressed as a list of image content words 21901 
using image IDs 21900 as keys. Or, as shown in FIG. 12, the
image content word holding unit 219 may be expressed as a 
list of image IDs 21911 using image content words 21910 as 
keys. 

FIG. 12 shows a table which stores data of the image 
content word holding unit 219 shown in FIG. 11 as a list of
image IDs using image content words as keys. Referring to 
FIG. 12, all image IDs 21911 that contain the individual 
words of image content words 21910 as keywords are 
stored. Note that FIG. 11 shows classification based on 
image IDs, and FIG. 12 shows classification based on image
content words. Therefore, since FIGS. 11 and 12 have the 
same contents, both the tables need not always be held. 
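
The relationship between the two tables can be sketched as follows: the table keyed by content word (FIG. 12) is derivable from the table keyed by image ID (FIG. 11), which is why only one of them needs to be held. The function name `invert` and the sample data are hypothetical.

```python
from collections import defaultdict


def invert(content_words_by_image):
    """Build {content_word: [image_id, ...]} from {image_id: [content_word, ...]}."""
    index = defaultdict(list)
    for image_id in sorted(content_words_by_image):
        for word in content_words_by_image[image_id]:
            # Keep each image ID at most once per content word.
            if image_id not in index[word]:
                index[word].append(image_id)
    return dict(index)


# Hypothetical contents of the FIG. 11 style table.
by_image = {1: ["sea", "sky"], 2: ["sky", "mountain"]}
by_word = invert(by_image)
```

Holding the word-keyed form trades extra storage for a direct lookup when searching by associative word.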

The structure of the concept discrimination dictionary 205 
will be described below using FIG. 13.

FIG. 13 shows the structure of the concept discrimination 
dictionary in the embodiment of the present invention. 
As shown in FIG. 13, the concept discrimination dictionary
205 provides information that pertains to a query word
serving as a search request, and stores index words 2050
corresponding to query words, search perspectives 2051
associated with index words 2050, antithetic concepts 2052
having meanings contrary to the index words 2050, associ- 
ated weights 2053 used upon searching the index words 
2050, and sensory pattern weights 2054 used upon searching 
the index words 2050 in correspondence with each other. 
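
A minimal Python sketch of such a dictionary is given below. The five fields follow the reference numerals 2050 to 2054, and the three search perspectives for "mild" are those named in the description of FIG. 8; the antithetic concepts and weight values are hypothetical placeholders, as are the names `ConceptEntry`, `perspectives_for`, and `default_weights`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConceptEntry:
    index_word: str              # 2050
    search_perspective: str      # 2051
    antithetic_concept: str      # 2052
    associated_weight: int       # 2053
    sensory_pattern_weight: int  # 2054


# Antithetic concepts and weights below are made-up illustration values.
ENTRIES = [
    ConceptEntry("mild", "color tone", "vivid", 5, 5),
    ConceptEntry("mild", "taste", "spicy", 8, 2),
    ConceptEntry("mild", "general atmosphere", "harsh", 6, 4),
]


def perspectives_for(entries, query_word):
    """List the search perspectives to display for a query word."""
    return [e.search_perspective for e in entries if e.index_word == query_word]


def default_weights(entries, query_word, perspective):
    """Return (associated weight, sensory pattern weight), or None if absent."""
    for e in entries:
        if e.index_word == query_word and e.search_perspective == perspective:
            return e.associated_weight, e.sensory_pattern_weight
    return None
```
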
The structure of the associative word dictionary 211 will
be explained below with reference to FIG. 14. 

FIG. 14 shows the structure of the associative word 
dictionary in the embodiment of the present invention. 
The associative word dictionary 211 is composed of
associative IDs 2110 each of which assigns a unique number
to a set of associative words for each index word 2111, index
words 2111 each serving as a start point of association,
associative words 2112 evoked by the index words 2111,
associative perspectives 2113 which are relevant to
associations of the associative words 2112, and association
strengths 2114 each indicating the strength of association
between each pair of index word 2111 and associative word
2112.

The association strength 2114 assumes an absolute value
ranging from 0 to 10, and its sign indicates the direction of
association of the associative word. More specifically, when
the association strength is a positive value, it indicates a
stronger associative relationship (higher bilateral
association) as the association strength value is larger; when
the association strength is a negative value, it indicates that
the word is harder to associate as the absolute value of the
association strength is larger. For example, an associative
word "folkcraft article" corresponding to an index word
"simple" in associative data with the associative ID=126533
can be associated with strength "6", but an associative word
"chandelier" in associative data with the associative
ID=126536 is hard to associate, with strength "9", since its
association strength is
a negative value. 
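
The sign convention of the association strength 2114 can be illustrated with a minimal Python sketch. The class and function names below are hypothetical and do not appear in the embodiment; the treatment of a strength of exactly zero is likewise an assumption, since the description does not address it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AssociativeEntry:
    """One row of the associative word dictionary 211 (hypothetical layout)."""
    associative_id: int      # associative ID 2110
    index_word: str          # index word 2111
    associative_word: str    # associative word 2112
    strength: int            # association strength 2114, signed, |value| in 0..10

    def __post_init__(self):
        if not -10 <= self.strength <= 10:
            raise ValueError("association strength must lie in -10..10")


def is_readily_associated(entry: AssociativeEntry) -> bool:
    # Positive sign: the associative word is evoked by the index word.
    # Assumption: zero is treated as "not readily associated".
    return entry.strength > 0


def magnitude(entry: AssociativeEntry) -> int:
    # The absolute value expresses how strong (or how hard) the association is.
    return abs(entry.strength)


# The two examples given in the description.
folkcraft = AssociativeEntry(126533, "simple", "folkcraft article", 6)
chandelier = AssociativeEntry(126536, "simple", "chandelier", -9)
```

Here `is_readily_associated(folkcraft)` is true with magnitude 6, while `chandelier` is hardly associated with magnitude 9.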

The structure of the search result holding unit 216 will be 
described below with reference to FIG. 15. 

FIG. 15 shows the structure of the search result holding 
unit in the embodiment of the present invention. 

The search result holding unit 216 stores image IDs which
are found by searches of the image content word search
processing unit 210 using associative words and the sensory
pattern search processing unit 214. Referring to FIG. 15,
reference numeral 2160 denotes a field for storing image IDs
found by search; 2161, a field for storing the number of
matched associative words with positive association
strengths found by the image content word search processing
unit 210 using associative words; and 2162, a field for storing a
list of a maximum of 20 associative IDs 2110 of matched
associative words in the associative word dictionary 211.
When the number 2161 of matched associative words is
zero, the associative ID 2162 is filled with NULL code.
Reference numeral 2163 denotes a field for storing the
search matching levels of associative words with respect to
the image IDs 2160. When the number 2161 of matched
associative words is zero, the associative matching level
2163 stores zero.

Reference numeral 2164 denotes a field for storing the 
number of sensory patterns with highest similarity, which 
are found by search by the sensory pattern search processing
unit 214; and 2165, a field for storing a list of a maximum
of 20 sensory pattern IDs of matched sensory patterns. When 
the number 2164 of matched sensory patterns is zero, the 
sensory pattern ID 2165 is filled with NULL code. Reference
numeral 2166 denotes a field for storing the search matching
level of a sensory pattern search with respect to the image ID
2160. When the number 2164 of matched sensory patterns is
zero, the sensory pattern matching level 2166 stores zero.
Reference numeral 2167 denotes a field for storing the
matching level (obtained by the search result integration
processing unit 217) of the image ID 2160 with respect to
the search request, which is calculated using the associative
matching level 2163 and sensory pattern matching level
2166 as parameters.

The structure of the above-mentioned unfolded sensory
pattern holding unit 213 will be described below with
reference to FIG. 4.

Referring to FIG. 4, reference numeral 2130-1 denotes an
image word as an unfolding source from which this sensory
pattern has evolved upon unfolding, and the same image
word as that in the query word holding unit 202 is stored. In
this embodiment, a character string "refreshing" is stored,
and ends with NULL code. Reference numeral 2130-2
denotes the number of sensory patterns obtained by unfolding
the image word 2130-1 with reference to the image
word/sensory pattern correspondence holding unit 215. For
example, when the contents of the image word/sensory
pattern correspondence holding unit 215 are as shown in
FIG. 4, the number of sensory patterns unfolded from the
image word "refreshing" is 7. Reference numeral 2130-3
denotes an address indicating the storage location area of
data obtained by actually unfolding the image word
"refreshing". The address 2130-3 is linked with unfolded
data 2130-4.

Reference numeral 2130-4 denotes unfolded data actually
unfolded from "refreshing", and sets of associative words
and sensory patterns corresponding to the number 2130-2 of
sensory patterns are stored here. In this embodiment, seven
sets of associative words and sensory patterns are stored.
Reference numeral 2130-5 denotes a sensory pattern ID
corresponding to the image word "refreshing" and an associative
word of "refreshing". In this embodiment, "005"
is stored. Reference numeral 2130-6 denotes an associative
word of the image word "refreshing". In this
embodiment, a character string "forest" is stored, and ends
with NULL code.

The structure of the aforementioned image word/sensory
pattern correspondence holding unit 215 will be described in
detail below using FIG. 3.

Referring to FIG. 3, reference numeral 2150-1 denotes an
image word serving as an unfolding source of this sensory
pattern. In this embodiment, character strings "refreshing",
"tropical", and the like are stored, and end with NULL code.
Reference numeral 2150-2 denotes an associative word
unfolded from the image word 2150-1. In this embodiment,
associative words "forest", "tableland", "blue sky", and the
like are stored in correspondence with "refreshing", and
these character strings end with NULL code. When no
character string is stored in this field, i.e., NULL code alone
is stored, this sensory pattern applies to all image words
"refreshing"; no specific associative word has been designated.

Reference numeral 2150-3 denotes a sensory pattern ID
corresponding to the image word 2150-1 and associative
word 2150-2. In this embodiment, "005" and "006" are
stored as sensory pattern IDs corresponding to the image
word "refreshing" and its associative word "forest". Also,
sensory patterns for "not refreshing" as an antithetic concept
of "refreshing" are stored. In this embodiment, for "not
refreshing", no associative words are registered and "001"
and "010" are registered as sensory pattern IDs.

The structure of the above-mentioned sensory pattern
holding unit 220 will be described in detail below using FIG.
5.

Referring to FIG. 5, reference numeral 2200-1 denotes an
image ID for identifying an image to be registered. The
image IDs use the same ones as those stored in the image
holding unit 218, and uniquely define images in this system.
A field 2200-2 stores sensory pattern IDs. In this
embodiment, since the matching levels between each image
and all sensory patterns stored in the image feature amount/
sensory pattern correspondence holding unit 223 are
calculated, all the sensory pattern IDs are stored. Reference
numeral 2200-3 denotes a numerical value indicating the
matching level between each image and sensory pattern. The
matching level assumes a value ranging from 0 to 1; 0
indicates the image does not match the sensory pattern at all,
and the matching level becomes higher as it is closer to 1.
For example, the matching level between the image with the
image ID=001 and sensory pattern 1 is 0.10, and the
matching level between that image and sensory pattern 2 is
[illegible].

The aforementioned image feature amounts will be
explained in detail below with reference to FIG. 6. In FIG.
6, X1, X2, X3, . . . , Xn represent image features extracted
from one image, B1, B2, . . . , Bm represent regions/blocks
from which image feature amounts are extracted, and x11 to
xnm represent image feature amounts extracted from the
individual regions/blocks. That is, feature amounts that
pertain to physical image features X1 to Xn are obtained in
units of regions/blocks. FIG. 16 exemplifies a case wherein
[illegible] expressed by "representative color [illegible]";
representative colors extracted from regions/blocks B1,
B2, . . . , Bn are C1(R1, G1, B1), C2(R2, G2, B2), . . . ,
Cn(Rn, Gn, Bn), and their image feature amounts are c1 to
cn.

The structure of the image feature amount holding unit
222 will be described below using FIG. 17.

FIG. 17 shows the structure of the image feature amount
holding unit in the embodiment of the present invention.

Referring to FIG. 17, reference numeral 2220-1 denotes
an image ID for identifying an image to be registered. The
image IDs use the same ones as those stored in the image
holding unit 218. Reference numeral 2220-2 denotes a block
or region number from which an image feature amount is
extracted. In this embodiment, B1, B2, . . . , Bm represent
the region/block numbers. Reference numeral 2220-3
denotes information (in this embodiment, a representative
color is used) indicating an image feature extracted from
each of the regions/blocks B1, B2, . . . , Bm (2220-2). This
embodiment exemplifies a case wherein chromatic image
features are extracted, and a plurality of pieces of information
C11(R11, G11, B11), . . . , Cn1(Rn1, Gn1, Bn1)
indicating colors are stored. Reference numeral 2220-4
denotes image feature amounts of image features extracted
from the individual regions/blocks. In this embodiment,
c11, . . . , cn1 are stored as the image feature amounts of
image features C11(R11, G11, B11), . . . , Cn1(Rn1, Gn1,
Bn1).

The structure of the image feature amount/sensory pattern
correspondence holding unit 223 will be described in detail
below using FIG. 7.

Referring to FIG. 7, reference numeral 2230-1 denotes a
sensory pattern ID, which uniquely identifies a sensory
pattern. Reference numeral 2230-2 denotes an image feature
amount corresponding to each sensory pattern ID. In this
embodiment, a sensory pattern is expressed by a chromatic
image feature amount, and a combination of color components
(values in a color space such as RGB, HVC, or the
like) corresponding to each sensory pattern ID is stored. In
this embodiment, values in the RGB color space are registered
as color components. The RGB values assume integers
ranging from 0 to 255, and a maximum of m colors
correspond to each sensory pattern ID.

The sensory pattern determination processing unit 224
calculates the matching levels between each of image data
registered in the image holding unit 218 and the respective
sensory patterns using the aforementioned image feature
amount holding unit 222 and image feature amount/sensory
pattern correspondence holding unit 223, and registers them
in the sensory pattern holding unit 220 (to be described later
in step S2207 in FIG. 22).

The processes executed in this embodiment will be
described below using FIG. 18.

FIG. 18 is a flow chart showing the processes executed in
the embodiment of the present invention.

In step S3001, a processing module that implements the
operation of the search request input processing unit 201 in
FIG. 2 executes a search request input process. Note that the
search request input process will be explained in detail later.

If it is determined with reference to the contents of the
search weight holding unit 204 in step S3002 that search
weights are designated, the designated values are stored in
the determined weight holding unit 207. On the other hand,
if no search weights are designated, index words 2050 are
searched for a query word stored in the query word holding
unit 202 with reference to the concept discrimination dictionary
205 so as to read out a corresponding associated
weight 2053 and sensory pattern weight 2054, and the
readout weights are stored in the determined weight holding
unit 207. If there is no index word 2050 that is relevant to
the contents of the query word holding unit 202, a default
value "5" is stored as both the associated and sensory pattern
weights in the determined weight holding unit 207.

It is checked with reference to the determined weight
holding unit 207 in step S3003 if the associated weight is
zero. If the associated weight is zero (YES in step S3003),
the flow advances to step S3005. On the other hand, if the
associated weight is not zero (NO in step S3003), the flow
advances to step S3004.

In step S3004, a processing module that implements the
operations of the associative word unfolding processing unit
208 and image content word search processing unit 210
using associative words in FIG. 2 executes a search process
using associative words. Note that the search process using
associative words will be described in detail later.

It is checked with reference to the determined weight
holding unit 207 in step S3005 if the sensory pattern weight
is zero. If the sensory pattern weight is zero (YES in step
S3005), the flow advances to step S3007. On the other hand,
if the sensory pattern weight is not zero (NO in step S3005),
the flow advances to step S3006.

In step S3006, a processing module that implements the
operations of the sensory pattern unfolding processing unit
212 and sensory pattern search processing unit 214 in FIG.
2 executes a search process using sensory patterns. Note that
the search process using sensory patterns will be described
in detail later. In step S3007, a processing module that
implements the operation of the search result integration
processing unit 217 executes a search result integration
process. Note that the search result integration process will
be described in detail later.

In step S3008, image files corresponding to image IDs
stored in the search result holding unit 216 as search results
obtained in step S3007 are read out from the image holding
unit 218, and are displayed. Note that this process is a known
one which is prevalent in search apparatuses of the
same type.

The search request input process in step S3001 will be
described in detail below with reference to FIG. 19.

FIG. 19 is a flow chart showing the details of the search
request input process in the embodiment of the present
invention.

In step S2011, a query word serving as a search request is
input. The query word input is attained by storing a character
code input at the KB 15 in the query word holding unit 202
on the RAM 13. In step S2012, search perspectives that are
relevant to the query word stored in the query word holding
unit 202 are extracted from the concept discrimination
dictionary 205. In this case, all search perspectives 2051
corresponding to index words 2050, which match the query word
in the query word holding unit 202, are extracted. For
example, when the query word is "mild", three search
perspectives "color tone", "taste", and "general atmosphere"
can be obtained.

It is checked in step S2013 if a search perspective or
perspectives is or are found. If a search perspective or
perspectives is or are found (YES in step S2013), the flow
advances to S2014. On the other hand, if no search
perspective is found (NO in step S2013), the flow advances
to step S2016.

In step S2014, the window for designating the search
perspective described above with reference to FIG. 8 is
displayed. In step S2015, the user selects a desired one of the
search perspectives displayed on the window. The selected
search perspective is stored in the search perspective holding
unit 203.

In step S2016, the user inputs search weights which
determine the weight balance on a search using associative
words and a search using sensory pattern in actual search in
relation to the search process in response to the search
request. That is, the user operates the slide button 41 on the
control panel shown in FIG. 9 to designate the weight ratios
on associative words and sensory patterns. When the user
does not designate any search weights, he or she presses the
button 42 in the display area on the control panel shown in
FIG. 9 to designate default values of the search weights.

It is checked in step S2017 if search weights are designated.
If search weights are not designated (NO in step
S2017), i.e., if the default values of the search weights are
designated, the processing ends. On the other hand, if search
weights are designated (YES in step S2017), the designated
associative word and sensory pattern weights are stored in
the search weight holding unit 204 in step S2018, thus
ending the processing.

The search process using associative words in step S3004
will be described in detail below with reference to the flow
chart in FIG. 20.

FIG. 20 is a flow chart showing the details of the search
process using associative words in the embodiment of the
present invention.

In step S2101, associative word data corresponding to
index words 2111 in the associative word dictionary 211,
that match the query word stored in the query word holding
unit 202, are found by search. That is, the associative word






US 6,493,705 B1


dictionary 211 is searched for index words 2150-2 (FIG. 3),
which match the query word, and registered associative
word data are extracted. If index words that match the query
word are found, all their associative IDs are stored in the
unfolded associative word holding unit 209.

In step S2102, the concept discrimination dictionary 205
is searched, and if an index word that matches the query
word in the query word holding unit 202 is found, a search
perspective 2051 corresponding to that index word is
extracted. The extracted search perspective 2051 is compared
with that stored in the search perspective holding unit
203, and if they match, an antithetic concept 2052 corresponding
to this index word is extracted. On the other hand,
if the two search perspectives do not match, data in which
the query word matches an index word continues to be
searched for, and if no antithetic concept whose search
perspective matches the index word is found finally, the flow
advances to step S2103.

In step S2103, the associative word dictionary 211 is
searched for associative words having an index word, which
matches the antithetic concept found in step S2102. If an
index word that matches the antithetic concept is found, their
associative IDs are stored in the unfolded associative word
holding unit 209 by appending a status code indicating an
antithetic concept thereto.

In step S2104, associative words are extracted based on
the associative IDs stored in the unfolded associative word
holding unit 209, and the image content word holding unit
219 is searched for image content words that match the
associative words. The search results are stored in the search
result holding unit 216.

First, the associative IDs are extracted from the unfolded
associative word holding unit 209, and corresponding
associative data are extracted with reference to the
associative word dictionary 211. Next, the association
strengths 2114 of the extracted associative data are extracted,
and if a status code indicating an antithetic concept is
appended to a given associative ID extracted from the
unfolded associative word holding unit 209, [illegible]
associative data is discarded, and the next associative data is
checked. In this manner, the obtained association strengths
are set in a work memory ASCF (not shown) on the RAM
13.

Then, an associative perspective corresponding to each
associative ID is extracted, and is compared with that stored
in the search perspective holding unit 203. If the two
perspectives match, a predetermined value α is set in a work
memory VPF (not shown) on the RAM 13. If they do not
match, a value α×0.1 is set in the work memory VPF on the
RAM 13.

Finally, the image content word holding unit 219 is
searched for image content words that match associative
words corresponding to the associative IDs. If an image
content word is found, the corresponding image ID is stored
in the image ID 2160 of the search result holding unit 216,
"1" is set in the number 2161 of matched associative words,
and the found associative ID is set in the associative word ID
2162. Then, a value obtained from the values in the work
memories ASCF and VPF on the RAM 13 and a
predetermined score β based on associative word matching
is stored as an associative matching level in the associative
matching level 2163. If an identical image ID has already
been stored, the value of the number 2161 of matched
associative words is incremented by 1, a new associative
word ID is added to the associative word ID 2162, and the
calculated associative matching level is added to the stored
associative matching level 2163 to update its value.

The search process using sensory patterns in step S3006
and the search result integration process in step S3007 will
be described in detail below with reference to FIG. 21.

FIG. 21 is a flow chart showing the search process using
sensory patterns in step S3006 and the search result integration
process in step S3007 in the embodiment of the
present invention.

This process is controlled in accordance with a processing
program stored in the DISK 14.

The user inputs a search request for images at
the search request input processing unit 201. The search
request contains one or a plurality of query words, search
perspectives, and the like. The query word input in this
embodiment is an abstract image word that expresses
impressions of images such as "refreshing", "warm", and
the like. In this embodiment, assume that an image word
"refreshing" is stored.

Steps [illegible] are implemented by the sensory pattern
unfolding processing unit 212. The image word stored in the
query word holding unit 202 is unfolded into sensory
patterns with reference to the image
word/sensory pattern correspondence holding unit 215. In
this embodiment, the query word holding unit 202 stores the
image word "refreshing", the unfolded associative word
holding unit 209 holds associative words "forest",
"tableland", "blue sky", and the like unfolded from
"refreshing", and the image word is unfolded into corresponding
sensory pattern IDs with reference to the image
word/sensory pattern correspondence holding unit 215. For
example, sensory pattern IDs "005" and "006", corresponding
to image word "refreshing"-associative word "forest"
are acquired, and a sensory pattern ID "007" corresponding
to image word "refreshing"-associative word "tableland"
is acquired.

[illegible] S2144 are implemented by the sensory
pattern search processing unit 214. In step S2143, image
IDs of images having matching levels larger than zero with
respect to the sensory pattern IDs stored in the unfolded
sensory pattern holding unit 213 are acquired. This process
is done for all the sensory patterns stored in the unfolded
sensory pattern holding unit 213. Note that the sensory
pattern search processing unit 214 acquires image IDs
having matching levels larger than zero with respect to the
sensory pattern IDs respectively unfolded from the query
word and antithetic concept.

In step S2144, sets of acquired sensory pattern IDs, image
IDs, and their matching levels are stored in the search result
holding unit 216.

Steps S2145 to S2149 are implemented by the search
result integration processing unit 217. In step S2145, two
sets of search results, i.e., the image content word search
results using associative words and sensory pattern search
results, which are stored in the search result holding unit
216, are integrated into one set of search results on the basis
of the search weights stored in the determined weight
holding unit 207 with reference to those search results.
When the sensory pattern search results include a sensory
pattern based on the antithetic concept to the query word, the
corresponding image is excluded from the integrated results.
Or the sensory pattern matching level of an image including
a sensory pattern of the antithetic concept may be lowered
upon integration. In this process, a method of obtaining
common elements of two sets of search results in units of
associative words (ANDing search results), a method of
calculating integrated matching levels based on the weights
on the searches, and selecting appropriate search results in
descending order of integrated matching levels, and the like
are available. In this embodiment, the method of calculating
the integrated matching levels will be exemplified below.

Let A be the associative matching level of an image that
matches an associative word "forest" stored in the search
result holding unit 216, B be the sensory matching level of
an image that matches the sensory pattern ID "005" corresponding
to the associative word "forest", and w1 and w2
(w1+w2=1) be the search weights stored in the determined
weight holding unit 207. Then, the integrated matching level
is given by:

    integrated matching level = w1*A + w2*B

or

    integrated matching level = (w1*A^2 + w2*B^2)^(1/2)

The integrated matching levels of all sensory patterns of all
associative words are calculated. When one image ID has
matching levels larger than zero with respect to a plurality
of sensory pattern IDs, a plurality of integrated matching
levels are obtained for one image. However, in this case, an
image with the highest integrated matching level is adopted
as a search result. This process is done for all images
corresponding to either set of search results larger than zero,
and images whose integrated matching levels are larger than
a predetermined threshold value X are selected as integrated
search results.

That is, it is checked in step S2146 if the integrated
matching level of an image to be processed is larger than the
threshold value X. If the integrated matching level is equal
to or smaller than the threshold value X (NO in step S2146),
the flow advances to step S2148. On the other hand, if the
integrated matching level is larger than the threshold value
X (YES in step S2146), the flow advances to step S2147. In
step S2147, the image ID of the image to be processed is
held in the search result holding unit 216 as a search result.
It is checked in step S2148 if the next image to be processed
still remains. If the next image still remains (YES in step
S2148), the flow returns to step S2145. On the other hand,
if no next image remains (NO in step S2148), the flow advances
to step S2149.

In step S2149, the sets of image IDs and their integrated
matching levels are stored in the search result holding unit
216, thus ending the processing.

An image registration process for registering test images
will be explained below with reference to FIG. 22.

FIG. 22 is a flow chart showing the image registration
process in the embodiment of the present invention.

This process is controlled in accordance with a processing
program stored in the DISK 14.

In step S2201, the user designates an image to be registered.
The image to be registered is designated from those
stored in an external storage device, an image input device,
an image database server connected to this image processing
apparatus, or the like (none of them are shown). In this
embodiment, assume that images serving as test images are
stored in advance, and the image to be registered is selected
from them.

In step S2202, an image ID corresponding to an image file
name of the designated image, and various kinds of image
information required for registration are acquired, and are
supplied to the image feature extraction processing unit 221.
The image ID is stored in correspondence with the image file
name to manage an image, and is acquired by searching data
in the image holding unit 218 using the image file name.
Various kinds of image information of the image include
pixel values indicating the width and height of an image, the
number of bits per pixel, the image size (in units of bytes),
the address of the area where a bitmap image is actually
stored, and the like. For example, when the file format of this
image is the bitmap format, these pieces of information are
stored in the header field of the image file, so
they can be acquired by referring to the header field. Even
when the file format of the image is not the bitmap format
but JFIF or FlashPix, required information can be similarly
obtained from the header field of a file. Or the image holding
unit 218 may store such image information, and the image
information may be acquired by referring to the image
holding unit 218 upon registration.

In step S2203, physical image feature amounts are
extracted by analyzing the image information corresponding
to the designated image ID. This process is done by the
image feature amount extraction processing unit 221. An
example of this process is as has already been described
previously with reference to FIG. 16. FIG. 16 shows an
example of the image feature amounts in this embodiment,
and representative colors are extracted in units of image
regions/blocks. The representative color may be obtained by
using a scheme of analyzing an actual bitmap image using
various kinds of input image information in units of pixels,
and calculating the average value of color components
(values in a color space such as RGB, HVC, or the like) used
in each region or block, or a color component with the
highest frequency of occurrence as a representative color.

In step S2204, extracted image feature amounts c1 to cn
are stored in the image feature amount holding unit 222 in
correspondence with the image ID of that image. This
example is as has already been described previously with
reference to FIG. 17.

In step S2205, all sensory pattern IDs stored in the image
feature amount/sensory pattern correspondence holding unit
223, and image feature amounts corresponding to those
sensory patterns are acquired with reference to the image
feature amount/sensory pattern correspondence holding unit
223. This example is as has already been described previously
with reference to FIG. 7.

In step S2206, the matching level between the acquired
sensory pattern and the image feature amounts corresponding
to the image is calculated. This process is done by the
sensory pattern determination processing unit 224. That is,
the chromatic image feature amounts corresponding to each
of the sensory patterns acquired in step S2205 are compared
with the image feature amounts extracted in step S2203 to
calculate their matching level. In this case, the matching
levels for all sensory patterns stored in the image feature
amount/sensory pattern correspondence holding unit 223 are
calculated. The matching level is calculated using a scheme
such as vector computations, statistic processes, or the like
using cosine measure.

In step S2207, the matching levels between all the sensory
patterns and the image calculated in step S2206 are stored in
the sensory pattern holding unit 220 in correspondence with
the image ID of that image. This example is as has already
been described previously with reference to FIG. 5.

The aforementioned process is done for all images to be
registered.

As described above, according to this embodiment, since
both the feature amount of multimedia information itself
corresponding to a query word which indicates multimedia
information wanted, and the content word that describes the
contents of multimedia information are used as query conditions
on the basis of associative words associated with the
query word, desired multimedia information can be
accurately extracted.
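
As an illustration of how the two kinds of search results can be combined, the following Python sketch computes a weighted sum of an associative matching level A and a sensory pattern matching level B with weights w1 and w2 (w1+w2=1), keeps the highest level per image, and retains only images above a threshold X. All function and variable names are hypothetical, the matching levels are assumed to be normalized to the range 0 to 1, and the data are invented for the example; this is a sketch under those assumptions, not the embodiment's implementation.

```python
def integrated_level(a: float, b: float, w1: float, w2: float) -> float:
    """Weighted sum of associative level a and sensory pattern level b."""
    if abs((w1 + w2) - 1.0) > 1e-9:
        raise ValueError("search weights are assumed to satisfy w1 + w2 = 1")
    return w1 * a + w2 * b


def integrate(candidates: dict, w1: float, w2: float, threshold: float) -> list:
    """candidates maps image ID -> list of (A, B) pairs, one per sensory pattern.

    Returns (image ID, integrated level) pairs above the threshold,
    in descending order of integrated level.
    """
    selected = {}
    for image_id, pairs in candidates.items():
        # An image matching several sensory patterns yields several levels;
        # only the highest one is adopted.
        best = max(integrated_level(a, b, w1, w2) for a, b in pairs)
        if best > threshold:
            selected[image_id] = best
    return sorted(selected.items(), key=lambda item: item[1], reverse=True)


# Hypothetical data: image "001" matches two sensory patterns, "002" one.
example = {"001": [(0.8, 0.5), (0.2, 0.9)], "002": [(0.1, 0.1)]}
results = integrate(example, w1=0.6, w2=0.4, threshold=0.3)
```

With these invented values, only image "001" survives the threshold, with an integrated level of about 0.68.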

For example, in a conventional system, when "sea" is 
obtained as a word which is associated with a query word 
"refreshing", a search result "rough sea" is highly likely to 
be found. However, in this embodiment, such result is
excluded when it is integrated with search results using
sensory patterns obtained from a combination 
"refreshing" — "sea". 
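
The exclusion in this example can be pictured as taking the common elements of the two result sets. The Python sketch below is a minimal illustration only: the image names, the matching levels, and the function name are all hypothetical, and it assumes that an image of a rough sea has a matching level of zero for the sensory patterns unfolded from "refreshing" and "sea".

```python
def and_results(content_word_hits: set, sensory_levels: dict) -> list:
    """Keep images found by the content word search that also match
    at least one sensory pattern with a level larger than zero."""
    return sorted(
        image for image in content_word_hits
        if sensory_levels.get(image, 0.0) > 0.0
    )


# Hypothetical results of the content word search for associative word "sea".
hits_for_sea = {"calm_sea", "rough_sea"}

# Hypothetical sensory pattern matching levels for "refreshing" and "sea".
levels = {"calm_sea": 0.72, "rough_sea": 0.0}

kept = and_results(hits_for_sea, levels)
```

Under these assumptions `kept` contains only "calm_sea", so the "rough sea" image does not reach the integrated results.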

Since multimedia information can be searched based on
associative words that express the contents pertaining to a
query word indicating desired multimedia information, and
the feature amount of multimedia information itself is used,
multimedia information having an inappropriate feature
amount which cannot meet the query word can be accurately
extracted.

In the above embodiment, image information is used as
information wanted. As for multimedia information (e.g.,
audio information) other than image information, the present
invention can be applied by executing information feature
amount extraction, and associating the extracted information
feature amount with sensory patterns.
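
The association between a feature amount and a sensory pattern can be sketched with a cosine measure, which the embodiment names as one possible scheme for calculating matching levels. The Python code below is an illustrative sketch only: the function name and the sample vectors are hypothetical, and feature amounts are assumed to be non-negative vectors of equal length, so that the result lies between 0 and 1.

```python
import math


def cosine_matching_level(feature: list, pattern: list) -> float:
    """Cosine of the angle between a feature amount vector and a
    sensory pattern vector; 0 means no match at all."""
    if len(feature) != len(pattern):
        raise ValueError("vectors must have the same length")
    dot = sum(f * p for f, p in zip(feature, pattern))
    norm_f = math.sqrt(sum(f * f for f in feature))
    norm_p = math.sqrt(sum(p * p for p in pattern))
    if norm_f == 0.0 or norm_p == 0.0:
        # Assumption: an all-zero vector matches nothing.
        return 0.0
    return dot / (norm_f * norm_p)


# Hypothetical feature amounts, e.g. normalized color or audio features.
same_direction = cosine_matching_level([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
orthogonal = cosine_matching_level([1.0, 0.0], [0.0, 1.0])
```

Vectors pointing in the same direction give a level of about 1, and orthogonal vectors give 0, which agrees with the 0 to 1 range used for matching levels in the sensory pattern holding unit 220.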

In the above description, the image holding unit 218, 
image content word holding unit 219, and sensory pattern 
holding unit 220 which undergo a search are allocated on the 
DISK 14 that builds a single device, but these building 30 
components may be distributed on different devices, and 
processes may be done on the network via the NIC 19. 

Note that the present invention may be applied to either a 
system constituted by a plurality of devices (e.g., a host 
computer, an interface device, a reader, a printer, and the 35 
like), or an apparatus consisting of a single equipment (e.g., 
a copying machine, a facsimile apparatus, or the like). 

The objects of the present invention are also achieved by
supplying a storage medium, which records a program code
of a software program that can implement the functions of
the above-mentioned embodiments to the system or
apparatus, and reading out and executing the program code
stored in the storage medium by a computer (or a CPU or
MPU) of the system or apparatus.

In this case, the program code itself read out from the
storage medium implements the functions of the above-
mentioned embodiments, and the storage medium which
stores the program code constitutes the present invention.

As the storage medium for supplying the program code,
for example, a floppy disk, hard disk, optical disk, magneto-
optical disk, CD-ROM, CD-R, magnetic tape, nonvolatile
memory card, ROM, and the like may be used.

The functions of the above-mentioned embodiments may 
be implemented not only by executing the readout program 
code by the computer but also by some or all of actual 
processing operations executed by an OS (operating system) 
running on the computer on the basis of an instruction of the 
program code. 

Furthermore, the functions of the above-mentioned
embodiments may be implemented by some or all of actual
processing operations executed by a CPU or the like
arranged in a function extension board or a function exten-
sion unit, which is inserted in or connected to the computer,
after the program code read out from the storage medium is
written in a memory of the extension board or unit.

As many apparently widely different embodiments of the
present invention can be made without departing from the
spirit and scope thereof, it is to be understood that the
invention is not limited to the specific embodiments thereof
except as defined in the appended claims.
What is claimed is: 

1. An information search apparatus for managing a plu- 
rality of kinds of multimedia information, and searching the 
managed multimedia information for desired multimedia 
information, comprising: 
a content word holding unit for storing the multimedia 
information, and content words which verbalize con- 
cepts expressed in the multimedia information in cor- 
respondence with each other; 
an associative word dictionary for storing the content 
words and associative words which are associated with 
the content words in correspondence with each other; 
input means for inputting a query word; 
first search means for acquiring an associative word 
corresponding to the query word input by said input 
means from said associative word dictionary, and 
searching multimedia information on the basis of the 
acquired associative word; 
extraction means for extracting a feature amount corre- 
sponding to the query word input by said input means; 
second search means for searching multimedia informa- 
tion on the basis of the feature amount extracted by said 
extraction means; and 
integration means for integrating search results obtained 
by said first and second search means. 

2. The apparatus according to claim 1, wherein said input 
means can also input a search perspective. 

3. The apparatus according to claim 2, further comprising:
a concept discrimination dictionary for storing index
words and antithetic concepts corresponding to the
index words in correspondence with each other, and
wherein said first search means acquires an index word 
and antithetic concept corresponding to the query word 
from said concept discrimination dictionary on the 
basis of the query word and search perspective input by 
said input means, and acquires an associative word 
corresponding to the query word from said associative 
word dictionary on the basis of the acquired index word 
and antithetic concept. 

4. The apparatus according to claim 1, further comprising:
a holding unit for storing associative words and sensory
patterns in correspondence with each other, and
wherein said extraction means acquires a sensory pattern 
corresponding to the associative word, which corre- 
sponds to the query word, from said holding unit, and 
extracts a feature amount of the acquired sensory 
pattern as the feature amount corresponding to the 
query word. 

5. The apparatus according to claim 1, wherein the 
multimedia information is image information. 

6. The apparatus according to claim 5, wherein the feature 
amount includes at least one of color scheme information, 
composition information, and shape information contained 
in the image information. 

7. The apparatus according to claim 1, wherein said 
integration means integrates the search results obtained by 
said first and second search means using first matching 
levels obtained from the search results of said first search 
means, and second matching levels obtained from the search 
results of said second search means. 

8. An information search method for managing a plurality
of kinds of multimedia information, and searching the
managed multimedia information for desired multimedia
information, comprising:

the storage step of storing on a storage medium a content 
word holding unit for storing the multimedia 
information, and content words which verbalize con- 
cepts expressed in the multimedia information in cor- 
respondence with each other, and an associative word 
dictionary for storing the content words and associative 
words which are associated with the content words in 
correspondence with each other; 
the input step of inputting a query word; 
the first search step of acquiring an associative word 
corresponding to the query word input in the input step 
from said associative word dictionary, and searching 
multimedia information on the basis of the acquired 
associative word; 

the extraction step of extracting a feature amount corre- 
sponding to the query word input in the input step; 

the second search step of searching multimedia informa-
tion on the basis of the feature amount extracted in the
extraction step; and

the integration step of integrating search results obtained 
in the first and second search steps. 

9. The method according to claim 8, wherein the input
step includes the step of allowing a search perspective to
also be input.

10. The method according to claim 9, wherein the storage
step also includes the step of storing on said storage medium
a concept discrimination dictionary for storing index words
and antithetic concepts corresponding to the index words in
correspondence with each other, and

the first search step includes the step of acquiring an index
word and antithetic concept corresponding to the query
word from said concept discrimination dictionary on
the basis of the query word and search perspective
input in the input step, and acquiring an associative
word corresponding to the query word from said asso-
ciative word dictionary on the basis of the acquired
index word and antithetic concept.

11. The method according to claim 8, wherein the storage
step also includes the step of storing on said storage medium
a holding unit for storing associative words and sensory
patterns in correspondence with each other, and

the extraction step includes the step of acquiring a sensory
pattern corresponding to the associative word, which
corresponds to the query word, from said holding unit,
and extracting a feature amount of the acquired sensory
pattern as the feature amount corresponding to the
query word.

12. The method according to claim 8, wherein the mul- 
timedia information is image information. 

13. The method according to claim 12, wherein the feature
amount includes at least one of color scheme information,
composition information, and shape information contained
in the image information.

14. The method according to claim 8, wherein the inte-
gration step includes the step of integrating the search results
obtained in the first and second search steps using first
matching levels obtained from the search results in the first
search step, and second matching levels obtained from the
search results in the second search step.

15. A computer readable memory for storing a program
code of an information search process for managing a
plurality of kinds of multimedia information, and searching
the managed multimedia information for desired multimedia
information, comprising:
a program code of the storage step of storing on a storage
medium a content word holding unit for storing the 
multimedia information, and content words which ver- 
balize concepts expressed in the multimedia informa- 
tion in correspondence with each other, and an asso- 
ciative word dictionary for storing the content words 
and associative words which are associated with the 
content words in correspondence with each other; 

a program code of the input step of inputting a query 
word; 

a program code of the first search step of acquiring an 
associative word corresponding to the query word input 
in the input step from said associative word dictionary, 
and searching multimedia information on the basis of 
the acquired associative word; 

a program code of the extraction step of extracting a 
feature amount corresponding to the query word input 
in the input step; 

a program code of the second search step of searching 
multimedia information on the basis of the feature 
amount extracted in the extraction step; and 

a program code of the integration step of integrating 
search results obtained in the first and second search 
steps. 

16. An information search apparatus for managing a 
plurality of kinds of multimedia information, and searching 
the managed multimedia information for desired multimedia 
information, comprising: 
a content word holding unit for storing the multimedia 
information, and content words which verbalize con- 
cepts expressed in the multimedia information in cor- 
respondence with each other; 
an associative word dictionary for storing the content 
words and associative words which are associated with 
the content words in correspondence with each other; 
input means for inputting a query word; 
a concept discrimination dictionary for storing index 
words corresponding to the query word and search 
perspectives pertaining to the index words in corre- 
spondence with each other; 
display means for extracting search perspectives pertain- 
ing to an index word corresponding to the query word 
input by said input means from said concept discrimi- 
nation dictionary, and displaying the extracted search 
perspectives; 

designation means for designating a desired one of the 
search perspectives displayed by said display means; 

first search means for acquiring an associative word 
corresponding to the query word input by said input 
means from said associative word dictionary, and 
searching multimedia information on the basis of the 
acquired associative word; 

second search means for extracting a feature amount 
corresponding to the query word input by said input 
means, and searching multimedia information on the 
basis of the extracted feature amount; and 

integration means for integrating search results obtained 
by said first and second search means on the basis of the 
search perspective designated by said designation 
means. 



17. The apparatus according to claim 16, wherein said
concept discrimination dictionary also stores antithetic con-
cepts corresponding to the index words, and
wherein said first search means acquires an index word
and antithetic concept corresponding to the query word
from said concept discrimination dictionary on the
basis of the query word and search perspective input by
said input means, and acquires an associative word
corresponding to the query word from said associative
word dictionary on the basis of the acquired index word
and antithetic concept.

18. The apparatus according to claim 17, further com- 
prising: 

a holding unit for storing associative words and sensory 
patterns in correspondence with each other, and 

wherein said extraction means acquires a sensory pattern
corresponding to the associative word, which corre-
sponds to the query word, from said holding unit, and
extracts a feature amount of the acquired sensory
pattern as the feature amount corresponding to the
query word.

19. The apparatus according to claim 17, wherein the 
multimedia information is image information. 

20. The apparatus according to claim 19, wherein the
feature amount includes at least one of color scheme
information, composition information, and shape informa-
tion contained in the image information.

21. An information search method for managing a plu-
rality of kinds of multimedia information, and searching the
managed multimedia information for desired multimedia
information, comprising:

the input step of inputting a query word; 

the storage step of storing on a storage medium a content
word holding unit for storing the multimedia
information, and content words which verbalize con-
cepts expressed in the multimedia information in cor-
respondence with each other, an associative word dic-
tionary for storing the content words and associative
words which are associated with the content words in
correspondence with each other, and a concept dis-
crimination dictionary for storing index words corre-
sponding to the query word and search perspectives
pertaining to the index words in correspondence with
each other;

the display step of extracting search perspectives pertain-
ing to an index word corresponding to the query word
input in the input step from said concept discrimination
dictionary, and displaying the extracted search perspec-
tives;

the designation step of designating a desired one of the 
search perspectives displayed in the display step; 

the first search step of acquiring an associative word
corresponding to the query word input in the input step
from said associative word dictionary, and searching
multimedia information on the basis of the acquired
associative word;

the second search step of extracting a feature amount
corresponding to the query word input in the input step,
and searching multimedia information on the basis of
the extracted feature amount; and

the integration step of integrating search results obtained
in the first and second search steps on the basis of the
search perspective designated in the designation step.

22. The method according to claim 21, wherein said 
concept discrimination dictionary also stores antithetic con- 
cepts corresponding to the index words, and 

the first search step includes the step of acquiring an index
word and antithetic concept corresponding to the query
word from said concept discrimination dictionary on
the basis of the query word and search perspective
input in the input step, and acquiring an associative
word corresponding to the query word from said asso-
ciative word dictionary on the basis of the acquired
index word and antithetic concept.

23. The method according to claim 21, wherein the 
storage step also includes the step of storing on said storage 
medium a holding unit for storing associative words and 
sensory patterns in correspondence with each other, and 

the second search step includes the step of acquiring a 
sensory pattern corresponding to the associative word, 
which corresponds to the query word, from said hold- 
ing unit, and extracting a feature amount of the 
acquired sensory pattern as the feature amount corre- 
sponding to the query word. 

24. The method according to claim 21, wherein the 
multimedia information is image information. 

25. The method according to claim 24, wherein the feature 
amount includes at least one of color scheme information, 
composition information, and shape information contained 
in the image information. 

26. A computer readable memory for storing a program 
code of an information search process for managing a 
plurality of kinds of multimedia information, and searching 
the managed multimedia information for desired multimedia 
information, comprising: 

a program code of the input step of inputting a query 
word; 

a program code of the storage step of storing on a storage 
medium a content word holding unit for storing the 
multimedia information, and content words which ver- 
balize concepts expressed in the multimedia informa- 
tion in correspondence with each other, an associative 
word dictionary for storing the content words and 
associative words which are associated with the content 
words in correspondence with each other, and a concept 
discrimination dictionary for storing index words cor- 
responding to the query word and search perspectives 
pertaining to the index words in correspondence with 
each other; 

a program code of the display step of extracting search 
perspectives pertaining to an index word corresponding 
to the query word input in the input step from said 
concept discrimination dictionary, and displaying the 
extracted search perspectives; 

a program code of the designation step of designating a 
desired one of the search perspectives displayed in the 
display step; 

a program code of the first search step of acquiring an 
associative word corresponding to the query word input 
in the input step from said associative word dictionary, 
and searching multimedia information on the basis of 
the acquired associative word; 

a program code of the second search step of extracting a 
feature amount corresponding to the query word input 
in the input step, and searching multimedia information 
on the basis of the extracted feature amount; and 

a program code of the integration step of integrating 
search results obtained in the first and second search 
steps on the basis of the search perspective designated 
in the designation step. 
item that has been included in the cache has changed, it
item if it is still included in the cache. In a preferred
embodiment, the data source is a database system and 
triggers in the database system are used to generate update 
messages. In a preferred embodiment, the data access layer 
determines whether a data item required by an application 
program is in the cache. If it is, the data access layer obtains 
the item from the cache; otherwise, it obtains the item from 
the data source. The queryable cache includes a miss table 
that accelerates the determination of whether a data item is 
in the cache. The miss table is made up of miss table entries 
that relate the status of a data item to the query used to access 
the data item. There are three statuses: miss, indicating that 
the item is not in the cache, hit, indicating that it is, and 
unknown, indicating that it is not known whether the item is 
in the cache. When an item is referenced, the query used to 
access it is presented to the table. If the entry for the query 
has the status miss, the data access layer obtains the item 
from the data source instead of attempting to obtain it from 
the cache. If the entry has the status unknown, the data 
access layer attempts to obtain it from the cache and the miss 
table entry for the item is updated in accordance with the 
result. When a copy of an item is added to the cache, miss 
table entries with the status miss are set to indicate unknown. 
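
The three-status lookup described above can be sketched as follows. This is an illustrative reading of the abstract, not the patented implementation: the class and attribute names are hypothetical, the data source and the cache are modeled as plain dictionaries keyed by query text, and a query with no miss table entry is treated as unknown.

```python
from enum import Enum


class Status(Enum):
    MISS = "miss"        # item is known not to be in the cache
    HIT = "hit"          # item is known to be in the cache
    UNKNOWN = "unknown"  # not known whether the item is in the cache


class DataAccessLayer:
    def __init__(self, source):
        self.source = source   # stand-in for the data source: query -> item
        self.cache = {}        # stand-in for the queryable cache
        self.miss_table = {}   # query -> Status
        self.source_reads = 0

    def _fetch(self, query):
        self.source_reads += 1
        return self.source[query]

    def get(self, query):
        status = self.miss_table.get(query, Status.UNKNOWN)
        if status is Status.MISS:
            # Known miss: go straight to the data source, skipping the cache.
            return self._fetch(query)
        if query in self.cache:
            self.miss_table[query] = Status.HIT
            return self.cache[query]
        # The cache attempt failed: record the miss for next time.
        self.miss_table[query] = Status.MISS
        return self._fetch(query)

    def add_to_cache(self, query, item):
        self.cache[query] = item
        # A copy was added, so entries marked miss can no longer be trusted.
        for known_query, status in self.miss_table.items():
            if status is Status.MISS:
                self.miss_table[known_query] = Status.UNKNOWN


dal = DataAccessLayer({"q1": "rows1"})
first = dal.get("q1")            # cache attempt fails, entry becomes MISS
dal.add_to_cache("q1", "rows1")  # MISS entries are reset to UNKNOWN
second = dal.get("q1")           # served from the cache, entry becomes HIT
```

The point of the miss status is that a repeated query for an uncached item costs one table lookup rather than a failed cache query followed by a query on the data source.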

16 Claims, 8 Drawing Sheets 
DYNAMIC CACHES WITH MISS TABLES

CROSS-REFERENCES TO RELATED APPLICATIONS

The present patent application is a continuation-in-part of
U.S. Ser. No. 09/294,656, Cusson, et al., Web servers with
queryable dynamic caches, filed Apr. 19, 1999, and claims
priority from U.S. Provisional Application No. 60/168,589,
Cusson et al., Improving the performance of dynamic data
caches by collecting multi-user query miss statistics, filed
Dec. 2, 1999. The patent application contains the entire
Detailed Description and drawing of U.S. Ser. No. 09/294,656.
The new material begins with FIG. 7 and the section of
the Detailed Description entitled Making cache misses
faster.

BACKGROUND OF THE INVENTION

1. Field of the Invention

The invention concerns caching of data in networks
generally and more specifically concerns the caching of
queryable data in network servers.

2. Description of the Prior Art

Once computers were coupled to communications
networks, remote access to data became far cheaper and
easier than ever before. Remote access remained the domain
of specialists, however, since the available user interfaces
for remote access were hard to learn and hard to use. The
advent of World Wide Web protocols on the Internet has
finally made remote access to data available to everyone. A
high school student sitting at home can now obtain infor-
mation about Karlsruhe, Germany from that city's Web site
and a lawyer sitting in his or her office can use a computer
manufacturer's Web site to determine what features his or
her new PC ought to have and then configure, order, and pay
for the PC.

A consequence of the new ease of remote access and the
new possibilities it offers for information services and
commerce has been an enormous increase in the amount of
remote access. This has in turn led to enormous new
burdens on the services that provide remote access and the
resulting performance problems are part of the reason why
the World Wide Web has become the World Wide Wait.

FIG. 1 shows one of the causes of the performance
problems. At 101, there are shown the components of the
system which make it possible for a user at his or her PC to
access an information source via the World Wide Web. Web
browser 103 is a PC which is running Web browser software.
The Web browser software outputs a universal resource
locator (URL) 104 which specifies the location of a page of
information in HTML format in the World Wide Web and
displays HTML pages to the user. The URL may have
associated with it a message containing data to be processed
at the site of the URL as part of the process of obtaining the
HTML page. For example, if the information is contained in
a database, the message may specify a query on the data
base. The results of the query would be returned as part
of the HTML page. Internet 105 routes the URL 104 and its
associated message to the location specified by the URL,
namely Web server 107. There, HTML program 109 in Web
server 107 makes the HTML page 106 specified by the URL
and returns it to Web browser 103. If the message specifies
a query, HTML program 109 hands the message off to Web
application program 111, which translates the message into
a query in the form required by data access layer 112.

Data access layer 112 is generally provided by the manu-
facturer of database server 115. It takes queries written in
standard forms such as OLE-DB, ODBC, or JDBC, converts
the queries into the form required by database server 115,
and places the queries in messages of the form required by
network 113. Database server 115 then executes the query
and returns the results to data access layer 112, which puts
the results into the required standard form and returns them
to Web application 111, which in turn puts the results into
the proper format for HTML program 109. HTML program 109
then uses the result in making the HTML page 106 to be
returned to browser 103.

As may be seen from the above description, a response to
a URL specifying a page whose construction involves data-
base server 115 requires four network hops: one on Internet
105 from browser 103 to Web server 107, one on network
113 from server 107 to server 115, one on network 113 from
server 115 to server 107, and one on Internet 105 from server
107 to browser 103. If more than one query is required for
an HTML page, there will be a round trip on network 113 for
each query.

Moreover, as shown at 117, a typical Web transaction is
a series of such responses: the first HTML page includes the
URL for a next HTML page, and so forth. The transaction
shown at 117 begins with a request for an HTML page that
is a form which the user will fill out to make the query; data
base server 115 provides the information for the HTML
page. When that page is returned, the user fills out the form
and when he or she is finished, the browser returns a URL
with the query from the form to server 107, which then deals
with the query as described above and returns the result in
another HTML page. That page permits the user to order,
and when the user orders, the result is another query to
database server 115, this time, one which updates the records
involved in the transaction.

Not only do Web transactions made as shown in FIG. 1
involve many network hops, they also place a tremendous
burden on data base server 115. For example, if data base
server 115 belongs to a merchant who sells goods on the
Web and the merchant is having a special, many of the
transactions will require exactly the same sequence of
HTML pages and will execute exactly the same queries, but
because system 101 deals with each request from a web
browser individually, each query must be individually
executed by database server 115.

The problems of system 101 are not new to the designers
of computer systems. There are many situations in a com-
puter system where a component of the system needs faster
access to data from a given source, and when these situations
occur, the performance of the system can be improved if
copies of data that is frequently used by the component are
kept at a location in the system to which the component has
faster access than it has to the source of the data. When such
copies exist, the location at which the copies are kept is
termed a cache and the data is said to be cached in the
system.

Caching is used at many levels in system 101. For
example, browser 103 keeps a cache of previously-displayed
HTML pages, so that it can provide a previously-displayed
HTML page to the user without making a request for the
page across Internet 105. Web server 107 similarly may keep
a cache of frequently-requested HTML pages, so that it can
return the page instead of constructing it.
Database server 115, finally, may keep a cache of the
information needed to answer frequently-made queries, so
that it can return a result more quickly than if it were starting






US 6,487,641 B1



from scratch. In system 101, the most effective use of
caching is in Web server 107, since data that is cached there
is still accessible to all users of Internet 105, while the
overhead of the hops on network 113 is avoided.

Any system which includes caches must deal with two 
problems: maintaining consistency between the data in the 
cache and the data in the data source and choosing which 
data to cache. In system 101, the first problem is solved in 
the simplest way possible: it is the responsibility of the 
component using the data to determine when it needs a new 
copy of the data from the data source. Thus, in browser 103, 
the user will see a cached copy of a previously-viewed 
HTML page unless the user specifically clicks on his
browser's "reload" button. Similarly, it is up to HTML program
109 to determine when it needs to redo the query that
provided the results kept in a cached HTML page. The
second problem is also simply solved: when a new page is
viewed or provided, it replaces the least recently-used
cached page.

Database systems such as the Oracle8™ server, manu-
factured by Oracle Corporation and described in Leverenz,
et al., Oracle8 Server Concepts, release 8.0, Oracle
Corporation, Redwood City, Calif., 1998, move a copy of a
database closer to its users by replicating the original
database at a location closer to the user. The replicated
database may replicate the entire original or only a part of it.
Partial replications of a database are termed table snapshots.
Such table snapshots are read-only. The user of the partial
replication determines what part of the original database is
in the table snapshot. Consistency with the original database
is maintained by snapshot refreshes that are made at times
that are determined by the user of the table snapshot. In a
snapshot refresh, the table snapshot is updated to reflect a
more recent state of the portion of the original database
contained in the snapshot. For details, see pages 30-5
through 30-11 of the Leverenz reference.

There are many applications for which the solution of
letting the component that is doing the caching decide when
it needs a new page causes problems. For example, when the
information in a data source is important or is changing
rapidly (for example, stock prices), good service to the user
requires that the information in the caches closely tracks the
information in the data source. Similarly, there are many
situations where caching all data that has been requested
causes problems. For instance, in a cache run according to
least recently-used principles, any HTML page that is pro-
duced by HTML program 109 or received in browser 103 is
cached and once cached, stays in the cache and takes up
space that could be used for other HTML pages until it
attains least recently-used status.

When Web server 107 includes a Web application 111
involving a database server 115, there is still another prob-
lem with caching in web server 107: since the data is cached

queryable form, in which the cached data is dependably
updated when the data in the source changes, and in which
selection of data from a source for caching is based on
something other than the mere fact that a URL received from
a web browser referenced the data, and thus provides a
solution to the foregoing problems. The cache thus solves
many of the problems of prior-art caches in network envi-
ronments.

A remaining problem, however, is that the only way that
Web server 107 can determine whether a query can be
performed on the cache instead of on database server 115 is
by doing the query on the cache and if a miss results, doing
the query on database server 115. A query that goes to
database server 115 as a result of a cache miss is thus
substantially slower than one that goes directly to database
server 115, and when there is a substantial number of cache
misses, the result may be a substantial degradation of the
overall performance of Web server 107 with a cache. It is an
object of the present invention to make a query to database
server 115 that results from a cache miss substantially as fast
as a query that goes directly to database server 115.

SUMMARY OF THE INVENTION 

The object is achieved by adding a miss table to a cache 
that contains copies of remotely-stored items. The query that 
is applied to the cache is in effect a specifier for the item that 
will be returned by the query. There may or may not be a 
copy of the item in the cache. If there is not, the remotely- 
stored item must be fetched. The miss table relates the 
specifier for the item to a status indicator that indicates at 
least whether the item is present in the cache. A dispatcher 
receives the specifier for the item and presents it to the miss 
table; if the miss table indicates that there is no copy of the 
item in the cache, the dispatcher uses the item specifier to 
fetch the remotely-stored data item. 

The status indicator may further indicate that it is 
unknown whether there is a copy of the item in the cache. 
When the status indicator so indicates, the cache responds to 
the remote item specifier and provides an indication whether 
there is a copy of the item in the cache. Amiss table manager 
for the miss table responds to the indication by updating the 
miss table in accordance with the indication. The cache 
further provides the miss table manager with a change event 
notification to the miss table manager when the contents of 
the cache have changed and the miss table manager responds 
thereto by setting the status for at least those items for which 
the status in the miss table indicates that there is no copy and 
which are affected by the change to unknown. 

In an preferred embodiment, the miss table is employed in 
a network server that includes a cache. The cache may 
contain a copy of a rowset from a remote location and 
responds to a rowset specifier specifying the remote location 
by returning the rowset when there is a copy in the cache. 



30 
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in the form of HTML pages, it is not in queryable form, that 5S The miss table relates the rowset specifier to a status 



60 



is, a cached HTML page may contain data from which 
another query received from Web browser 103 could be 
answered, but because the data is contained in an HTML 
page instead of a database table, it is not in a form to which 
a query can be applied. Thus, even though the data is in 
server 107, server 107 must make the query, with the 
accompanying burden on data base server 115 and delays 
across network 113, and the HTML page containing the 
result of the query must be separately cached in server 107. 

U.S. Ser. No. 09/294,656, Cusson, et al., Web servers with 65 
queryable dynamic caches, describes a web server 107 that 
has a cache in which cached data is to the extent possible in 



indicator as described above. If the miss table indicates to 
the network server that there is no copy of the rowset in the 
cache, the network server fetches the rowset from the remote 
location. 
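
The interplay of dispatcher, miss table, miss table manager, and cache summarized above might be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the preferred embodiment: the class and method names (Cache, MissTableManager, Dispatcher, load, lookup), the use of a dictionary as the miss table, and the sample specifier "q1" are all hypothetical. As a simplification, the change event handler treats every entry marked as a miss as possibly affected by the change.

```python
from enum import Enum


class Status(Enum):
    HIT = "hit"
    MISS = "miss"
    UNKNOWN = "unknown"


class Cache:
    """Stand-in for the cache of remotely-stored items (hypothetical API)."""

    def __init__(self):
        self.items = {}
        self.listeners = []

    def contains(self, specifier):
        return specifier in self.items

    def get(self, specifier):
        return self.items[specifier]

    def load(self, specifier, item):
        self.items[specifier] = item
        for notify in self.listeners:
            notify()  # change event notification


class MissTableManager:
    def __init__(self, cache):
        self.cache = cache
        self.table = {}  # item specifier -> Status
        cache.listeners.append(self.on_cache_changed)

    def status(self, specifier):
        state = self.table.get(specifier, Status.UNKNOWN)
        if state is Status.UNKNOWN:
            # Only in the unknown state is the cache itself consulted.
            state = Status.HIT if self.cache.contains(specifier) else Status.MISS
            self.table[specifier] = state
        return state

    def on_cache_changed(self):
        # Assumption: any recorded miss may be affected by the change.
        for specifier, state in self.table.items():
            if state is Status.MISS:
                self.table[specifier] = Status.UNKNOWN


class Dispatcher:
    def __init__(self, cache, manager, fetch_remote):
        self.cache = cache
        self.manager = manager
        self.fetch_remote = fetch_remote

    def lookup(self, specifier):
        if self.manager.status(specifier) is Status.MISS:
            return self.fetch_remote(specifier)  # bypass the cache
        return self.cache.get(specifier)


remote = {"q1": "rowset-1"}
cache = Cache()
manager = MissTableManager(cache)
dispatcher = Dispatcher(cache, manager, remote.__getitem__)

first = dispatcher.lookup("q1")        # unknown, then miss: fetched remotely
state_after_first = manager.table["q1"]
cache.load("q1", "rowset-1")           # cache contents change
state_after_change = manager.table["q1"]
second = dispatcher.lookup("q1")       # unknown, then hit: served from cache
state_after_second = manager.table["q1"]
```

Once an entry is marked as a miss, later lookups for the same specifier go to the remote source after a single table lookup, which is the saving the miss table is meant to provide.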

Other objects and advantages will be apparent to those 
skilled in the arts to which the invention pertains upon 
perusal of the following Detailed Description and drawing, 
wherein: 

BRIEF DESCRIPTION OF THE DRAWING 

FIG. 1 is an example of a prior-art system for performing 
queries via the World Wide Web; 
FIG. 2 is a high-level block diagram of a system of the 
invention; 

FIG. 3 is a detailed block diagram of details of an 
implementation of server 203; 

FIG. 4 is a detailed block diagram of details of an 
implementation of source database server 237; 

FIG. 5 is a detail of cache database description 305; 

FIG. 6 is a flowchart of the operation of query dispatcher 
351; 

FIG. 7 is an overview of a miss table as used in query 
analyzer 313 of queryable cache 302; and 

FIG. 8 is a flowchart of the operation of the miss table. 

Reference numbers in the drawing have three or more 
digits: the two right-hand digits are reference numbers in the 
drawing indicated by the remaining digits. Thus, an item 
with the reference number 203 first appears as item 203 in 
FIG. 2. 

DETAILED DESCRIPTION 

The following Detailed Description will begin with a 
conceptual overview of the invention and will then describe 
a presently-preferred embodiment of the invention. 

Overview of the Invention: FIG. 2 

FIG. 2 shows a system 201 for retrieving information via 
a network which includes one or more network servers 
[...] queryable cache [...] 
the contents of cache 223 are determined by [...] of 
what queries will most probably be made by users of server 
203(i) in the immediate future. Server 203 is a Web server 
107, and thus has an HTML component 109, a Web 
application component 111, and a data access component 253 
which [...] 
been modified to work with queryable cache 219. Server 203 
could, however, communicate with its users by any other 
kind of network protocol. Server 203 further communicates 
with source data base server 237 by means of network 113 
[...] 

FIG. 2 shows one server 203, server 203(i), in detail. As 
before, Web application 111 provides a query in a standard 
form to data access 253. Here, however, data access 253 has 
[...] cache 
data base 236 that has a copy 223 of a portion of the data in 
source database 241. When data access 253 receives a query 
from Web application 111, it first presents the query to 
[...] 215. If cached data 223 
[...] queryable cache 219 
[...] 
application 111. If cached data 223 does not include the data 
[...] 
signal (M) 216 to data access 253, which then makes the 
query via network 113 to source database server 237 and 
[...] Web application 111. 
[...] miss signal appears as 
[...] appears as [...] 

It is important to note here that because the interactions 
with queryable cache 219 and with source database server 
237 are both performed by data access layer 253, 
queryable cache 219 is completely transparent 
to Web application 111. That is, a Web application program 
111 that runs on Web server 107 will run without changes on 
server 203(i). 

Continuing in more detail with queryable cache 219, the 
data cached in queryable cache 219 is contained in cache 
database 236, which, like any database, contains data, in this 
case, copies of datasets (database tables) from source 
database 241 that are cached in queryable cache 219, and a query 
engine (QE 221), which runs queries on the datasets in 
cached data 223. The portion of queryable cache 219 which 
receives queries from data access layer 253 is data access 
interface 212. Data access interface 212 has two functions: 

It determines whether the query can be executed on 
cached data 223, that is, whether cached data 223 
contains the data required to execute query 215, and 
generates miss signal 216 if it does not. 

If cached data 223 does contain the data, it puts query 215 
into the proper form for cache database 236. 

Data access interface 212 makes the determination 
whether the query can be executed by analyzing the query to 
determine the query's context, that is, what datasets are 
required to execute the query, and then consulting a 
description of cached data 223 to determine whether these datasets 
are present in cached data 223. The datasets are specified in 
the query by means of dataset identifiers, and consequently 
the context is for practical purposes a list of the identifiers 
for the required data sets. The description 223 of course 
includes the dataset identifiers for the cached data sets. If the 
[...] 
queryable, that is, a dataset is 
[...] queryable cache 219 can 
[...] described by a query. For example, 
[...] includes [...] of the kinds 
[...] Web commerce and 
[...] cache 219 [...] be able to handle a 
[...] 

Cached data 223 is kept consistent with source database 
241 by means of update transmitter 243 in source database 
server 237 and update receiver 210 in queryable cache 219. 
Whenever a change occurs in source database 241 in a 
[...] update transmitter 243 generates a cache update query 
(CUDQ) 234 specifying the change and sends CUDQ 234 
via network 113 to each of servers 203(0 . . . n). Update 
receiver 210 receives CUDQ 234 from network 113 and 
determines from the data set description maintained by DA 
212 whether the dataset is in cached data; if it is, it 
puts the cache update query into the proper form 251 for 
cache database 236 and provides it to cache refresher 249, 
which runs query 251 on cache database 236. 

Data set manager (DSM) 213 decides generally what 
copies of datasets from source database server 237 are to be 
included in cache database 236. The information that DSM 
213 uses to make this determination is contained in query 
information 208. Query information 208 may be any 
information available to server 203(i) which can be used to 
predict what datasets of source database 241 will most 
probably be queried in the near future. For example, if a 
company engaged in Web commerce is having a 1-day sale 
on certain items for which there are datasets in source 
database 241, query information 208 may indicate the 
datasets for the items and the time of the 1-day sale. Using 
that information, DSM 213 can obtain the datasets from 
source database 241 and cache them in cache database 236 
before the beginning of the sale and remove them from 
cache database 236 after the end of the sale. 

Another kind of query information 208 is a query log, a 
time-stamped log of the queries received from data access 
layer 253; if the log shows a sharp increase in the occurrence 
of queries for a given dataset, DSM 213 should cache the 
datasets for that query in cache 219 if they are not there 
already. Conversely, if the log shows a sharp decrease in the 
occurrence of such queries, DSM 213 should consider 
removing these datasets from queryable cache 219. When 
DSM 213 determines that a dataset should be added to 
queryable cache 219, it sends a new data query (NDQ) 218 
via network 113 to source data base 241 to obtain the new 
data and when DSM 213 has the response (NDR 220) it 
sends a delete query to query engine 221 indicating the data 
to be deleted in cached data 223 to make way for the new 
data and then sends a cache update query 251 to cache 
refresher 249 to update the cache. 

Data set manager 213 and query information 208 may 
also be implemented in part in source data base server 237 
or anywhere where information about the probability of 
future queries may be obtained. When implemented in 
source data base server 237, the query log would log each 
query 231 to source database 241 and at least the portion of 
data set manager 213 which reads the query log to determine 
what new data needs to be cached would be in source 
database server 237; when it determined that new data 
needed to be cached, it would send an update query with the 
new data to each of the servers 203. The component of DSM 
213 that determines what is to be removed could also be in 
source database server 237, in which case, all queryable 
caches 219 would contain the same data in cached data 223, 
or that component could be in each server 203(i), with the 
component making decisions concerning what data to 
remove to accommodate the new data based on the present 
situation in server 203(i). In such an arrangement, there can 
be a local query log in each server 203 in addition to the 
global query log in source database server 241. Such an 
arrangement would permit different servers 203 to have 
different-sized caches 223; it would also permit different 
servers 203 to take local variations in the queries they are 
receiving into account in determining what data to remove 
from cache 219. One way such variations might occur is if 
system 201 were set up so that different servers 203 
preferentially received queries from users in different 
geographical locations. 

FIG. 2 shows only a single source database server 237; 
there may of course be more than one; moreover, source 
database server 237 need not be a classical database system. 
Server 203(i) can be set up to be used with data sources 
containing any kind of queryable data, where queryable is 
defined as having a form which can be represented as a set 
of numbered rows of data. Such a set of numbered rows is 
termed a rowset. Database tables are of course one example 
of rowsets; others are files of data records, text files, and still 
and moving image data. If server 203(i) is used with data 
sources having only a single kind of queryable data, 
queryable cache 219 need only be set up to deal with that kind of 
queryable data. 

If server 203(i) is used with data sources having more than 
one kind of queryable data, cache database 236 may be set 
up using a rowset representation that will accommodate all 
kinds of queryable data. In that case, DA 212, 
DSM 213, and update receiver 210 translate between 
the results and update queries received from the various data 
sources and the representations used in cached data 236. In 
other embodiments, there may be more than one cache 
database 236 in queryable cache 219, with different cache 
databases being used for different kinds of queryable data. 
Again, DA 212, DSM 213, and update receiver 210 will 
perform the necessary translations. 

Details of a Preferred Embodiment of a Data 
Access Layer and a Queryable Cache: FIGS. 3, 5, 
and 6 

FIG. 3 shows a preferred embodiment 301 of data access 
349 and queryable cache 302. Corresponding components of 
FIGS. 2 and 3 have the same names. Cache database 347 in 
embodiment 301 is an Oracle8 Server, which is described in 
detail in Leverenz, et al., Oracle8 Server Concepts, release 
8.0, Oracle Corporation, Redwood City, Calif., 1998. In 
preferred embodiment 301, Web application 111 uses global 
dataset identifiers in queries. The Web applications 111 in 
all of the servers 203 use the same set of global data set 
identifiers. A cache database 347 in a given server 203 has 
its own set of local data set identifiers for the data sets 
cached in cache data base 347. In preferred embodiment 
301, then, one may speak of global queries and query 
contexts that use global data set identifiers and local queries 
and query contexts that use local data set identifiers. In the 
preferred embodiment, query analyzer 313 uses cache 
database description 305 to translate global query contexts into 
local query contexts. 

Data access layer 349 includes a new component, query 
dispatcher 351, which is the interface between data access 
layer 349 and queryable cache 302. FIG. 6 is a flowchart 601 
of the operation of query dispatcher 351 in a preferred 
embodiment. Reference numbers in parentheses refer to 
elements of the flowchart. When data access layer 349 is 
preparing to query source database 241, it provides the 
global context for the query to query dispatcher 351 (605), 
which in turn provides global context 318 (FIG. 3) to query 
analyzer 313 (607). Query analyzer 313 determines whether 
the datasets identified by the global context are cached in 
cache database 347; if they are not, query analyzer 313 
reports a miss 319 to query dispatcher 351 (609), which 
indicates to data access layer 349 that it is to place the global 
query on network 113. 

If the datasets identified by the global context are cached 
in cache database 347, query analyzer 313 indicates that fact 
to query dispatcher 351 and also provides query dispatcher 
351 with local context 316 for the datasets in cache database 
347 (615). Query dispatcher 351 then provides the local 
context to data access layer 349, which uses the local context 
to make a local query 317 corresponding to the global query 
and then uses the local query to obtain local result 320 from 
cache database 347. It should be noted here that the 
operations involved in the translation from the global query to the 
local query and applying the local query to cache database 
347 may be divided among data access layer 349, query 
dispatcher 351, and query analyzer 313 in many different 
ways; the advantage of the technique of flowchart 601 is that 
data access layer 349 can employ the same mechanisms to 
make local queries as it does to make global queries. All 
query analyzer 313 and query dispatcher 351 need do is 
supply data access layer 349 with the local context needed 
to make the local query. 

Continuing with the details of queryable cache 302 and 
beginning with DA interface 304, interface 304 receives a 
global context 318 from query dispatcher 351 and depending 
on whether the datasets for the queries are in cache database 
347, provides either local context 316 or a miss signal 319. 
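
The hit and miss paths of flowchart 601 might be sketched as follows. This is an illustration under stated assumptions rather than the code of embodiment 301: the class names, the use of a dictionary in place of cache database description 305, the sample dataset identifiers, and the representation of a context as a plain list of identifiers are all hypothetical.

```python
class QueryAnalyzer:
    """Maps global dataset identifiers to local ones (stand-in for description 305)."""

    def __init__(self, description):
        self.description = description  # global dataset id -> local dataset id

    def analyze(self, global_context):
        """Return the local context on a hit, or None to signal a miss."""
        if all(dataset in self.description for dataset in global_context):
            return [self.description[dataset] for dataset in global_context]
        return None


class QueryDispatcher:
    """Interface between the data access layer and the queryable cache."""

    def __init__(self, analyzer):
        self.analyzer = analyzer

    def dispatch(self, global_context):
        local_context = self.analyzer.analyze(global_context)
        if local_context is None:
            # Miss: the data access layer places the global query on the network.
            return ("global", global_context)
        # Hit: the data access layer makes a local query with the local context.
        return ("local", local_context)


def run_query(dispatcher, global_context, query_cache, query_source):
    """Data access layer: the same call shape is used for local and global queries."""
    target, context = dispatcher.dispatch(global_context)
    return query_cache(context) if target == "local" else query_source(context)


analyzer = QueryAnalyzer({"PRICES": "t_17", "ITEMS": "t_04"})
dispatcher = QueryDispatcher(analyzer)

hit = dispatcher.dispatch(["PRICES", "ITEMS"])
miss = dispatcher.dispatch(["PRICES", "ORDERS"])
result = run_query(dispatcher, ["PRICES"],
                   lambda ctx: ("cache", ctx),
                   lambda ctx: ("source", ctx))
```

In this sketch a query is run locally only if every dataset in its global context has a local identifier; a single uncached dataset sends the whole query to the source.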
DA interface 304 has two main components: query analyzer 
313 and cache database description manager 303. 

Query analyzer 313 analyzes global contexts received 
from data access layer 253 and other components of 
embodiment 301 to obtain the global context's global dataset 
identifiers. Having obtained the global dataset identifiers, 
query analyzer 313 provides them to CDB description 
manager 303, which looks them up in cache database 
description 305. Cache database description 305 is a table of 
datasets. At a minimum, there is an entry in the table for each 
dataset that has a copy in cache database 347. Each such 
entry contains the dataset's global identifier and its local 
identifier. The table also contains query information 307. 
[...] 
manager 303 then [...] indication of 
whether the dataset is in cache database 347 (H/M 311). If 
it is not, the query cannot be run on cache database 347, but 
must be run on source database 241, and consequently, query 
analyzer 313 returns a miss signal 319 to query dispatcher 
351. If the query can be run on cache database 347, query 
analyzer 313 returns a hit signal 319 and also returns local 
context 316 for the query. As indicated above, query 
dispatcher 351 then provides local context 316 to data access 
layer 349, which uses it to make local query 317 on cache 
database 347. Cache database 347 then returns local result 
320 to data access layer 349. 

FIG. 5 shows details of CDB description 305. In a 
preferred embodiment, it is a table which has at least an 
entry 501 for each dataset of source database 241 of which 
there is a copy in cache database 347. Each entry 501 
contains the global dataset identifier for the data set, by 
which the dataset is known in all servers 107 with queryable 
caches 219 containing copies of the dataset, the local data set 
identifier 505, by which the dataset is known in cache 
database 347, and number of queries 507, which indicates 
the number of times the dataset has been queried over an 
interval of time. In the preferred embodiment, number of 
queries 507 embodies query information 307. 

An entry 501(i) for a given dataset is accessed in a 
preferred embodiment by a hash function 503 which takes 
global dataset ID 507 for the dataset and hashes it into an 
entry index 509 in table 305. CDB description manager 303 
then searches table 305 for the entry 501 whose field 503 
specifies global DSID 511 beginning at entry index 509. If 
[...] 
347 and CDB description manager 303 signals a miss 311 to 
query analyzer 313. Table 305 may also include entries 501 
for global datasets that are not presently cached in cache 
database 347; in such entries, local dataset ID 505 has a null 
value and a miss is returned in response to the null value. 
The purpose of such entries is to maintain number of queries 
information 507 for such data sets, so that dataset manager 
323 can determine whether to add the entry's dataset to 
cache database 347. 

[...] 
source database server 237 from data access 253 and uses 
query analyzer 313 to determine whether the dataset affected 
by the update is in cache database 347. If it is not, update 
[...] 
333 and executes its queries. 

Data store manager 323 uses query information 307 in 
CDB description 305 to determine what datasets to add to or 
delete from cache database 347. With data sets to be added, 
DSM 323 makes the necessary queries to source database 
241 and when the results arrive, DSM 323 makes them into 
update queries 239 and provides the update queries 329 to 
change queue 333, from which they are executed by 
refresher 331 as described above. DSM 323 further updates 
CDB description 305 as required by the changes it makes in 
cache database 347, as shown at 327. 

In a preferred embodiment, DSM 323 and refresher 331 
have their own threads or processes. It should also be 
pointed out here that CDB description 305 and change queue 
333 could be implemented as database tables in cache 
database 347. Because these components are implemented 
independently of cache database 347 and because abstract 
[...] translator 339 [...] to cache database 
347, [...] independent of the 
[...] 
cache database 347. In embodiment 301, data access 203 
only provides read queries to data access interface 304. All 
update queries go directly to server 237 without 
being entered in cache database 347. In other embodiments, 
queryable cache 219 may be implemented as a writethrough 
cache, i.e., the update may be entered in cache database 347 
and also sent to server 237. It should be pointed out here that 
most Web applications are mostly-read applications, that is, 
a Web user typically spends far more time reading 
information than he or she does changing it. For instance, in Web 
commerce, the "shopping" is mostly a matter of reading 
HTML pages, with updates happening only when the user 
adds something to his or her "shopping cart" or makes his or 
her purchases. In a system such as system 201, only making 
the purchases would typically involve an update of source 
database 241. 

Details of Source Database Server 237: FIG. 4 

FIG. 4 shows a preferred embodiment of source database 
server 237. Source database server 237 in the preferred 
embodiment is implemented by means of an Oracle8 server 
executing on a computer system that includes a disk drive 
on which is stored source database 241 and memory 415, 
which contains buffer cache 407 for copies of data values 
from database 241 and dictionary cache 409 for copies 
of the metadata from the database. Metadata is database tables 
[...] writebacks 
[...] memory 415 to [...] database 241 
[...] 
[...] corresponds to a server 
[...] resulting from cache misses, update 
[...] Dispatcher 311 gives each of these processes in 
[...] which performs the 
[...] the results to the querying process, 
[...] 
corresponding server 203. 

The implementation of source database server 
237 is an Oracle8 database system to which has been 
added an implementation of update transmitter 243, which 
automatically sends an update to queryable cache 219 in 
each of the servers 203(0 . . . n) when data in source database 
241 has been [...] to cached data 223 [...] 
components of updater 243 in FIG. 4 are labeled with the 
reference number [...] 
[...] action to be taken if a 
predefined change occurs in a data value or an item of 
metadata in the database. Many database systems permit 
definition of triggers; triggers in the Oracle8 database 
system are described in detail at pages 17-1 through 17-17 of 
the Leverenz reference. 

In the preferred embodiment, when a process 401(i) 
corresponding to a server 203(i) receives a query from DSM 
323 in server 203(i) for data to be added to server 203(i)'s 
cached data 223, process 401(i) executes set trigger code 
403. This code sets an Oracle8 AFTER row trigger in 
metadata 417 for each row of data and/or metadata specified 
in the query. Shared server process 317 takes the action 
specified in the trigger whenever the trigger's row of data 
has been modified. The action specified for the trigger is to 
send a message to each of the servers 203(0 . . . n) with an 
update query that modifies the data in cached data 223 in the 
same fashion as it was modified in source database 241. In 
the preferred embodiment, the action performed by the 
trigger is to place the message with the update query in 
message queue 414, which is implemented as an Oracle8 
advanced queue. Message queue 414 is read by update 
process 402, which sends the messages in queue 414 to each 
of the servers 203(0 . . . n). 

Adding new data to cached data 223 in response to or in 
anticipation of changes in the behavior of the users of 
internet 105 and updating cached data 223 in response to 
changes in source database 241 may of course be 
implemented in ways other than in the preferred embodiment 
shown in FIGS. 3 and 4. For example, determining what data 
[...] 223 [...] done [...] 
[...] servers 203. Source 
database 241, like the cached databases 347 in the servers 
203(0 . . . n), can maintain statistics information, and a send 
process 404 in source server 237 can analyze the statistics in 
substantially the same fashion as described for DSM 323, 
determine what data should be sent to the servers 203(0 . . . 
n) for caching in cached data 223, make update queries for 
that data, and place messages containing the update queries 
in message queue 414, from which update process 402 can 
send them to the servers 203. 

Updating cached data [...] 
updates that have been performed on source database 241. 
The database system maintains the log so that it can redo 
updates in case of system failure, but the log can also be used 
to update cached data 223. If there is a table in source 
database 241 which describes cached data 223, update 
process 402 can use the table in conjunction with redo log 
413 to determine whether an update in redo log affects 
cached data 223. If it does, update process 402 can send a 
copy of the update query to the servers 203 as just described. 

Caching Servers and Source Servers That Do Not 
Involve Database Systems 

The techniques used to determine what data should be 
cached in server 203 and to update cached data 223 can also 
be employed in systems where the data is not queryable. For 
example, the source data may simply be a collection of 
documents, identified perhaps by a document number (such 
as its URL, if the document is an HTML page), and the 
cached data may be simply a subset of the collection. What 
cache web application 211 would receive from HTML 
component 109 in such a system would simply be the 
document number for a document; if it is present in the 
cached data, the caching server would return it from there; 
otherwise, it would fetch it from the source server. Query log 
205 in such a case would be a time-stamped list of the 
documents that had been requested, together with an 
indication [...] DSM [...] 
[...] would determine as 
described above for the database whether a document should 
be included in the cached data, and having made the 
determination, would obtain it from the source server. As 
also described above, a send component on the source server 
could make the same determination and send the document 
to the caching servers. 

For update purposes, the source server would simply 
maintain a list of the documents that were presently in the 
caching servers; if one of the documents on the list was 
updated, updater 243 would send the new version of the 
document to the caching servers, where DSM 213 would 
replace any copy of the document in the cache with the new 
copy. The techniques just described for documents could of 
course also be used with files and with audio, image and 
motion picture data. 

Making Cache Misses Faster 

Most dynamic caches are "load on miss" caches, that is, 
the system to which the cache belongs presumes that data 
that is being referenced is in the cache, and if a miss occurs 
indicating that the data is not in the cache, the system finds 
the data, fetches it, and loads it into the cache. In a "load on 
miss" cache, only the first reference to uncached data results 
in a miss and on the miss, the time it takes to determine 
whether a miss has occurred is not important compared with 
the time it takes to fetch and load the data into the 
cache. 

[...] "load on miss" 
[...] in cache 
[...] 
[...] will probably be queried in the near 
[...] may not even put 
frequently-queried data into cache 302. For example, if the 
data in question is changing at a rapid rate in source database 
[...] 
[...] 302 does not result in data being 
[...] the cost of determining whether a miss 
[...] cost of finding, fetching, and 
loading data into the cache. Moreover, because a miss 
does not result in data being loaded, there will generally be 
[...] "load on miss" cache. For both of 
these reasons, there is a need in cache 302 to reduce the cost 
[...] general, a query to database 241 
that comes about as a result of a miss on cache 302 takes 
little or no more time than a query that is simply made 
directly to source database 241. 

FIG. 7 shows the apparatus used in a preferred 
embodiment of cache 302 to reduce the cost of the miss. Cache miss 
accelerator 701 is a component of query analyzer 313. The 
chief component of cache miss accelerator 701 is miss table 
719, which contains a number of miss table entries 721. 
Each miss table entry 721 represents a query made to a 
rowset in a source server 237. The entry 721 for a given 
query and source server contains a state value 723 which 
indicates one of at least the following: 

a hit: the rowset specified by the query is in cache 302; 

a miss: the rowset specified by the query is not in cache 
302; 

unknown: it is unknown whether the rowset specified by 
the query is in cache 302. 

If a query has a miss table entry 721 in miss table 719 and 
state value 723 indicates that the rowset specified by the 
query is not in cache 302, there is no need to search cache 
database description table 305 to determine whether the 



US 6,487,641 B1 

rowset is in the cache. The difference in the amount of time 
it takes to make a query to source database 241 that results 
from a miss and the amount of time it takes to make a direct 
query to source database 241 is simply the time it takes to 
find the query's entry 721 in miss table 719. 

The remaining components of cache miss accelerator 701 
serve to accelerate the process of finding a query's entry 721 
in miss table 719 and to maintain miss table 719. As 
described above, a query as originally received in data 
access layer 349 has a global context 318, which specifies 
the query in the terms required for source database 214. A 
query to cache database 347 must, however, specify the 
query in the terms required for cache database 347. These 
terms are the query's local context 316. One of the tasks of 
query analyzer 313 is to parse the global context into its 
components, so that CDB description manager 303 can 
translate the global context 318 into the corresponding local 
context 316. This parsing task is performed by GC parser 
702, which divides global context 318 into user context 703, 
which identifies the user(s) making the query, server context 
705, which identifies source database 214, and SQL query 
707, which is the SQL statement specifying the query to be 
made on source database 214. 

All three components of global context 318 are necessary 
to completely characterize a query, so all three are provided 
to accelerator 701. Each miss table entry 721 corresponds to 
a <user, server, query> tuple. Thus, each miss table entry 721 
includes in addition to status field 723 a field 703 indicating 
a user context, a field 705 indicating a source database 
214, and a field 707 indicating an SQL query. An important 
property of accelerator 701 is that there is only one miss 
table entry 721 for a given <user, server, query> tuple. Status field 
723 in that miss table entry thus makes the experience of any 

reports a miss to query analyzer 313, which passes it on to 
data access layer 349, as indicated by arrow 725, which, as 
indicated by the reference number in parentheses, performs 
the function of hit/miss indicator 319 of FIG. 3. Query 
analyzer 313 further responds to the miss by not passing 
global context 318 on to CDB description manager 303. 

If the proper MTE 721's state value 723 indicates a hit, 
miss table manager 711 reports that as well to query analyzer 
313, which passes the result to data access layer 349. These 
components then function as previously described, with 
query analyzer 313 providing global context 318 to CDB 
description manager 303, CDB description manager 303 
returning local context 316 to query analyzer 313, and query 
analyzer 313 providing it to data access layer 349, which 
uses local context 316 to make a local query 317 to query 
cache database 347. 

If the proper MTE 721's state value 723 indicates 
"unknown", miss table manager 711 reports that fact to 
query analyzer 313, as indicated by arrow 712, and query 
analyzer 313 provides CDB description manager 303 with 
MTE 721's tuple <703, 705, 707>. CDB description manager 
303 then searches CDB descriptions 305 and CDB 
description manager 303 provides H/M 311 to query analyzer 313 
according to the results of the search. Query analyzer 313 
then provides H/M 311 to miss table manager 711. If H/M 
311 indicates that the rowset represented by tuple <703, 705, 
707> is in cache database 347, miss table manager 711 sets 
status field 723 in MTE 721 to indicate a hit; query analyzer 
313 also indicates a hit via 319 to data access layer 349 and 
uses CDB description manager 303 to obtain local context 
316 for data access layer 349. If H/M 311 indicates a miss, 
query analyzer 313 provides the miss to miss table manager 
711 and to data access layer 349, and miss table manager 711 

entity making a query with the given tuple available to all sets status field 723 to indicate a miss 

entities makmg queries with the given tuple. This property 35 If there is no entry in miss table 719 for tuple <703 705 

of accelerator 701 is particularly important in situations 707> from global context 318, miss table manager' 711 

7^1 SLTh 18 ^ y f Inte , me ' Pr ° t0C01 addreSS makeS entr y md places i( at »»* «dex returned by hash 

Miss table manager 711 is a collection of routines which the entry and miss table manager 711 then proceeds as 

provides the interface between miss table 711 and the rest of 40 described above for MTEs 721 with state Talues 72^ Tndi! 

query analyzer 313. Miss table manager 711's most frequent eating "unknown " 

en,™™ f S readiD8 ^ °l StatU V 23 ta miSS taWe In a P referred «nbodiment, there is a fixed number of 

entry 721 for a given source database and query and report- MTEs 721 and when a new MTE 721 is required but none 

I T\ mdlCateS ^ 01 ^ l ° qUMy isava ilablemmisst a ble719,mernisstablemanager^ 

? * £o ™ m PaSS T thC rcp ° rt 10 daU aCCeSS 45 a MTE 721 curren "y » reuse. The selection is 

Uyer 349. Tne operation proceeds as follows: When a user done on the basis of least frequent use. Miss table ™ 

wishes to query the source database the user provides the 711 keeps track of the frequency of use of MTEs 72?^d o 

£J II r CeS f 'Til « 49 ' QU6ry 351 ^ Status TO. A^e of « MTE 721 with a status of 

ttien provides global context 318 to query analyzer 313. "miss" has a much higher weight than a use of an MTE witf, 
Parser 702 parses global context 318 into its components, 50 the status "hit" or the status "unknown", and mu^MTEsT21 

st™ g0 10 m ^ table manager J, 11 ;; as ^ m6 status " miss " tend to sta y in m ^ ™ ^ 

shown in FIG. 701. To ensure rapid access, mess table 719 than MTEs 721 that have one of the other statuses 

is implemented as a hash table, using techniques that are In addition to making MTEs 721 and providme and 

SSS Sertc 70^ Ti^V^ ^ ^ *«* * MTOs 721 > 

?7 7 ™£ ^."S 703 ' and «Q L 1 ner y 707 to a hash 55 711 has to ensure that the contents of miss table 719 track the 

function 709, which returns an index value 718 correspond- contents of CDB description 305 In a pre^rred 

7? ^,S° 5 ' 707> - ^ * 3 MTE 721 f ° r lhat emb °<^ this is done in response to a eachTchange 
w!l <? ?' '■ mSS J tabl6 manager J U Can ***** cvent which CDB description manager 303 provides to 

locate the entry using mdex value 718. To determine query analyzer 313 whenever a copy is added to debase 

™e 7?, Pr ° Per MTE721 , has been found, miss table 60 347. Query analyzer 313 provides the'eache change even,^ 

7^ 7fL reiT^ f ^ V ' ™ ° f ' hC tUplC <7 ° 3 ' Valuc 723 * 31101 thc in miss tab 'e 719 whose status 

705,707> received from GC parser 702. value 723 indicates "miss" to "unknown". As queries cor- 

Once miss table manager 711 has located the proper MTE responding to the MTEs 721 come in, miss table manager 
'nHilT.h^t manage . r 11 reads , status 723 ' If il 65 7U deals ^ thos « MTEs 721 as described above for 

^7r,V 7 ^7 "7 Tr""^^ eDtry ' S mirtc MTEs 721 ^ °" ^ value "unknown". In other 
<703,705,707> is not ,n cache 302, miss table manager 705 embodiments, CDB description manager 303 may provide a 
description of the added copy along with the cache change
event, and in such embodiments, miss table manager 711 
may only change the status of MTEs 721 representing 
queries that are affected by the presence of the added copy. 
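
The miss table behaviour described above can be illustrated with a minimal Python sketch. It is not the patented implementation: the class and function names, the `cache_contains` callback (standing in for the search of CDB descriptions 305), and the numeric use weights are all assumptions chosen for illustration. A Python dictionary plays the role of the hash table indexed by the <user, server, query> tuple.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    MISS = "miss"
    HIT = "hit"
    UNKNOWN = "unknown"


# Hypothetical weights: a use of a "miss" entry counts far more than a use
# of a "hit" or "unknown" entry, so "miss" entries tend to survive reuse.
USE_WEIGHT = {Status.MISS: 10, Status.HIT: 1, Status.UNKNOWN: 1}


@dataclass
class Entry:
    status: Status = Status.UNKNOWN
    score: int = 0  # weighted count of uses of this entry


class MissTable:
    def __init__(self, capacity, cache_contains):
        self.capacity = capacity              # fixed number of entries
        self.cache_contains = cache_contains  # callable(key) -> bool (assumed)
        self.entries = {}                     # dict hashes the tuple key

    def route(self, user, server, query):
        """Return "cache" or "source" for a <user, server, query> tuple."""
        key = (user, server, query)
        entry = self.entries.get(key)
        if entry is None:
            if len(self.entries) >= self.capacity:
                # Reuse the entry with the lowest weighted use count.
                victim = min(self.entries, key=lambda k: self.entries[k].score)
                del self.entries[victim]
            entry = self.entries[key] = Entry()  # new entries start "unknown"
        if entry.status is Status.UNKNOWN:
            # Only an "unknown" status triggers a search of the cache.
            found = self.cache_contains(key)
            entry.status = Status.HIT if found else Status.MISS
        entry.score += USE_WEIGHT[entry.status]
        return "cache" if entry.status is Status.HIT else "source"

    def on_cache_change(self):
        """A copy was added to the cache: every "miss" becomes "unknown"."""
        for entry in self.entries.values():
            if entry.status is Status.MISS:
                entry.status = Status.UNKNOWN
```

In this sketch a query whose entry says "miss" goes straight to the source without searching the cache, which is the saving the miss table is meant to provide; the cache is searched again only after `on_cache_change` has reset the entry to "unknown".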

In a preferred embodiment, the status value 723 of MTE 
721 for a tuple <703,705,707> is part of query information 
307; consequently, as indicated by arrow 715, miss table 
manager 711 periodically provides the state of all of the 
entries in MTE 721 to CDB description manager 303. 

The semantics of miss table 719 for the case when there
is an entry in miss table 719 corresponding to tuple <703,
705,707> from the query are shown in overview in flowchart
801 of FIG. 8. The relevant routine of miss table manager
711 is invoked at 803 with the <user,server,query> tuple; at
805, status 723 for the corresponding MTE 721 is retrieved
from miss table 719; at 807, the value of status 723 is used
to determine whether miss branch 809, hit branch 815, or
unknown branch 821 will be taken. In miss branch 809, the
miss is signaled to data access layer 349 at 811 and at 813,
the data access layer queries the data source. In hit branch
815, the hit is signaled to data access layer 349 at 817 and
at 819, the data access layer queries the cache. In unknown
branch 821, the cache is searched for the rowset correspond-
ing to the query at 823; at 825, the results of the search are
signaled to the data access layer, which queries the cache or
the data source accordingly. At 827, status 723 is updated in
the MTE in accordance with the result. The routine returns
at 829.

Conclusion

The foregoing Detailed Description has disclosed to those
skilled in the art of caching data how a miss table may be
used to speed up references to uncached data in a dynamic
cache which is not loaded on miss and how such a miss table
may be used with a queryable cache in a network server. The
Detailed Description has further disclosed the best mode
presently known to the inventors of practicing their
invention. It will, however, be immediately apparent to those
skilled in the art of caching data that many features of the
embodiment of the miss table disclosed in the Detailed
Description are consequences of the environment provided
by the queryable cache for which it has been implemented
and that embodiments can be made for other environments
which will work according to the principles of the miss table
disclosed herein, but will otherwise differ substantially from
the embodiment disclosed herein. It is also the case that
many different techniques are known in the art for
implementing tables and for accelerating the process of
locating an entry in a table, and many of these techniques
can be used in implementations of miss tables that work
according to the principles of the miss table disclosed
herein. Moreover, in other embodiments, the miss table may
have statuses in addition to the miss status that differ from
those of the miss table disclosed herein and different
techniques may be used to decide when a table entry is to be
reused or how the miss table should be updated when the
data in the cache changes.

For all of the foregoing reasons, the Detailed Description
is to be regarded as being in all respects exemplary and not
restrictive, and the breadth of the invention disclosed herein
is to be determined not from the Detailed Description, but
rather from the claims as interpreted with the full breadth
permitted by the patent laws.

What is claimed is:

1. An improved network server of a type that includes a
cache containing a copy of a rowset from a remote location,
the network server responding to a rowset specifier
specifying the remote location and the rowset therein by
providing the copy from the cache when the copy is therein,

the network server having the improvement comprising:
a miss table that relates the rowset specifier to a status
indicator, the status indicator being able to indicate at
least whether the copy is in the cache, the network
server using the miss table prior to applying the
rowset specifier to the cache to determine whether
the copy is in the cache, and when not, responding to
the rowset specifier by fetching the rowset from the
remote location.

2. The network server set forth in claim 1 wherein:

the miss table employs an entry that includes at least the
status indicator to relate the rowset specifier to the
status indicator; and

the status indicator is further able to indicate that it is
unknown whether the copy is in the cache,

the network server responding when the status indicator
indicates that it is unknown whether the copy is in the
cache by searching for the copy in the cache and setting
the status indicator in the entry according to whether
the copy is in the cache.

3. The improved network server set forth in claim 2
wherein:

the network server further fetches the copy from the cache
when the copy is therein.

4. The improved network server set forth in claim 2
wherein:

when a copy is added to the cache and the status indicator
affected thereby currently indicates that the copy is not
in the cache, the network server sets at least that status
indicator to indicate unknown.

5. The improved network server set forth in claim 4 
wherein: 

the miss table further comprises a plurality of the miss 
table entries, each entry having an index; and 

when the network server responds to the rowset specifier, 
the network server hashes the rowset specifier to obtain 
an index of a miss table entry. 

6. The improved network server set forth in any one of 
claims 1 through 5 wherein: 

the cache and the remote location are queryable; and 
the rowset specifier specifies a query. 

7. An improved method of obtaining a rowset stored in a 
remote data source in response to a rowset specifier that 
specifies the remote data source and the rowset therein by 
performing the steps of 

applying the rowset specifier to a local cache to retrieve 
a copy of the rowset therefrom; and retrieving the 
rowset from the remote data source only if the copy is 
not in the local cache, the improved method further 
comprising the step of: 

prior to the step of applying the rowset specifier to the 
local cache, applying the rowset specifier to a miss 
table that relates the rowset specifier to a status indi- 
cator indicating at least whether the copy is in the local 
cache, 

the step of applying the rowset specifier to the local cache
being performed only if the status indicator indicates
that the copy is in the local cache.

8. The method set forth in claim 7 wherein

the miss table employs an entry that includes at least the
status indicator to relate the rowset specifier to the
status indicator; and

the status indicator is further capable of indicating that it
is unknown whether the copy is in the cache; and

the method further comprises the step performed when the
status indicator indicates that it is unknown whether the
copy is in the cache of:

searching for the copy in the cache and setting the
status indicator in the entry according to whether the
copy is in the cache.

9. The method set forth in claim 8 further comprising the 
step of:

fetching the copy from the cache when the copy is therein. 

10. The method set forth in claim 8 further comprising the 
step of: 

when a copy is added to the cache and the status indicator 
affected thereby currently indicates that the copy is not
in the cache, setting at least that status indicator to 
indicate unknown. 

11. The method set forth in claim 10 wherein 

the miss table further comprises a plurality of the miss 
table entries, each entry having an index; and 

the method further comprises the step of: 
hashing the rowset specifier to obtain an index of a miss 
table entry. 

12. The method set forth in any one of claims 7 through
11 wherein: 

the cache and the remote location are queryable; and 
the rowset specifier specifies a query. 

13. Apparatus that fetches items of data from a remote 
location, the apparatus comprising:

a cache that stores copies of the items and provides an 
item's copy in response to an item specifier for the 
item; 

a miss table that relates an item specifier to a status 
indicator that indicates at least whether there is a copy
of the item specified by the item specifier in the cache; 
and 

a dispatcher that responds to the item specifier by pre- 
senting the item specifier to the miss table prior to 
applying the item specifier to the cache and on receiving
an indication from the miss table that there is no
copy in the cache, fetching the item of data from the 
remote location. 

14. The apparatus set forth in claim 13 wherein the 
apparatus further comprises: 

a miss table manager that modifies the miss table; 

the miss table employs an entry that includes at least the
status indicator to relate the rowset specifier to the
status indicator;

the status indicator may further indicate that it is unknown 
whether there is a copy of the item; and 

when the status indicator indicates that it is unknown 
whether there is a copy of the item, the cache responds 
to the item specifier and provides an indication whether 
there is a copy of the item to the miss table manager, the 
miss table manager updating the miss table in accor- 
dance with the indication. 

15. The apparatus set forth in claim 14 wherein: 

the cache provides a change event notification to the miss 
table manager when a copy of a data item has been 
added to the cache; and 

when the status indicator affected thereby currently indi- 
cates that the copy is not in the cache, the miss table 
manager responds to the change event notification by 
setting the status indicator to indicate unknown. 

16. The apparatus set forth in any one of claims 13 
through 15 wherein: 

the cache and the remote location are queryable; and 
the item specifier specifies a query. 
syntax 272 of the index module and search modules utilized.
For complex topic areas that are of high priority to the
community, the predefined search atoms typically are
combined into molecules manually 274. This simulates the
action of the query builder for a given profile. Subsequently,
the search atoms or search molecules should be iteratively
tested to ensure the accuracy of results 276. The lexicon
developer can allow a user to choose the high-level topic
areas 278 and create or personalize a target profile 280 in
order to develop a default target profile. Feedback can be
gathered from the community to assist with the development
or refinement of the community lexicon predefined search
queries 282. For ongoing maintenance and improvement,
reports from the pattern analysis module that uncover
relationships or statistical occurrences typically are examined
284 and used to refine the community lexicon. The reports
from the pattern analysis module may also be analyzed for
the individualized lexicon free form search queries 286. At
this point, if the lexicon can effectively retrieve electronic
objects based on the community's interests and topics,
lexicon development is complete 288.

FIG. 12 details the procedure followed by the pattern
analysis module when analyzing the electronic objects
returned by the searching subsystem. The pattern analysis
module first reads 300 metadata, indexes, abstracts, and
ratings of the retrieved electronic objects stored in the index
table. Statistics are produced 301 describing the frequency
and patterns of terms in each object. The pattern analysis
module may, more specifically, find the occurrence of
lexicon terms 314. The occurrence of lexicon terms in
association with other lexicon terms in electronic objects may
also be compiled. These data can then be used for many
different analysis purposes. The pattern analysis module itself
may perform further processing of the data. In the alternative,
the pattern analysis module may be configured to provide a
pipeline of associated terminology for data analysis by other
modules that may be added to the system. 
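
The frequency and association statistics described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `lexicon_statistics`, the representation of retrieved objects as a mapping from an identifier to plain text, and the restriction to single-word lexicon terms are hypothetical and are not taken from the specification.

```python
from collections import Counter
from itertools import combinations
import re


def lexicon_statistics(objects, lexicon):
    """Count lexicon-term occurrences and pairwise co-occurrences.

    objects: mapping of object id -> text (assumed representation)
    lexicon: iterable of single-word lexicon terms
    """
    terms = {t.lower() for t in lexicon}
    frequency = Counter()     # total occurrences of each lexicon term
    cooccurrence = Counter()  # number of objects in which a pair appears
    for text in objects.values():
        words = re.findall(r"[a-z0-9]+", text.lower())
        found = Counter(w for w in words if w in terms)
        frequency.update(found)
        # Each unordered pair of distinct terms counts once per object.
        for pair in combinations(sorted(found), 2):
            cooccurrence[pair] += 1
    return frequency, cooccurrence
```

The two counters returned here could then be passed to a further processing step, in the manner of the pipeline of associated terminology mentioned above.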

One type of data analysis that can be performed by the
pattern analysis module or an additional module is to
identify improvements that could be made to the lexicon.
The electronic objects may be identified based on the
frequency 302 of selection. For the electronic object, the
underlying search elements 304 are identified and custom
search queries are generated 306 which are associated with
the personalized topic area. The pattern analysis module is
then able to automatically add 308 the custom search
elements to the user target profile or recommend 310 the
custom search elements to the system administrator.

A second type of analysis that could be performed by the 
pattern analysis module is to identify and store 312 the
popularity of each object. This allows users to select those 
topic areas which are most popular within the community. 

A typical pattern analysis module also identifies
occurrences of patterns 314 of certain lexicon terms. The statistics
316 of the occurrence patterns of the lexicon terms are then 
stored and can be used to refine the lexicon or be fed to other 
processing modules. 

An example of further processing of the frequency of
occurrence of terms would be to find a pattern over time in
the electronic documents. For example, the pattern of
merger and acquisition venture activity could be identified
over time for a specific company or specific industry. The
pattern analysis module, or an added module, may be used
to identify the other companies involved in the merger and
acquisition ventures, the intensity of the activity, the other
industries involved, or other useful information which
involves plotting the occurrence of available lexicon terms
or user-specified terms. 
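
As a rough illustration of finding a pattern over time, the following Python sketch counts occurrences of one term per calendar month. The function name `term_counts_by_month` and the representation of each document as a (date, text) pair are assumptions made for the example; the specification does not prescribe a data format or a time granularity.

```python
from collections import Counter
from datetime import date


def term_counts_by_month(documents, term):
    """Count occurrences of `term` per (year, month).

    documents: iterable of (date, text) pairs (assumed representation)
    """
    counts = Counter()
    needle = term.lower()
    for published, text in documents:
        # Plain substring matching: "merger" also matches "mergers".
        n = text.lower().count(needle)
        if n:
            counts[(published.year, published.month)] += n
    return counts
```

The resulting counts, sorted by key, give the series that would be plotted to show the intensity of activity over time.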

The various embodiments described above are provided 
by way of illustration only and should not be construed to 
limit the invention. Those skilled in the art will readily 
recognize the various modifications and changes which may 
be made to the present invention without strictly following 
the exemplary embodiments illustrated and described 
herein, and without departing from the true spirit and scope 
of the present invention, which is set forth in the following 
claims. 

What is claimed: 

1. A system for search and retrieval of electronic objects, 
the objects including electronically encoded information, the 
system comprising: 

a searching subsystem comprising 
one or more electronic lexicons in a memory within the 
system, wherein each lexicon is configured to pro- 
vide predefined search elements designed to identify 
objects relevant to a specific community; and 
a format filter subsystem coupled to the searching 
subsystem comprising a plurality of format filter 
modules operable with the lexicon and configured to 
identify a format of an electronic object and to select 
a corresponding one of the format filter modules that 
will enable the system to search the object using the 
search elements within the lexicon; 
a profile management subsystem coupled to the lexicon 
comprising a community module, a profile module, 
and an atlas module, wherein the community module 
is configured to enable selection of a community 
lexicon, wherein each community lexicon includes a 
library of topics and search elements, wherein the 
profile module is configured to enable creation of a 
topic profile by selecting at least one topic from a 
library of topics, wherein each topic identifies a 
subject that is relevant to the information needs of 
the community, and wherein the atlas module is 
configured to enable creation of a user atlas by 
indicating at least one preferred data resource from a 
list of data resources from which objects may be 
retrieved; 

whereby potential sources of information can be easily 
searched and relevant information can be retrieved for a 
user. 

2. The system of claim 1, wherein the searching sub- 
system further comprises a community module configured to 
enable selection of a lexicon, wherein each lexicon stores a 
library of topics and corresponding search elements. 

3. The system of claim 2, wherein each topic within the 
library of topics is associated with one or more of the 
predefined search elements within the lexicon, and wherein 
each topic identifies a subject that is relevant to the infor- 
mation needs of the community. 

4. The system of claim 1, wherein the searching subsystem
further comprises a profile module configured to enable 
creation of a target profile by selecting at least one topic 
from a library of topics, wherein each topic is associated 
with one or more of the predefined search logic elements and 
each topic identifies a subject or concept of interest that is 
relevant to the information needs of the community. 

5. The system of claim 1, wherein the searching sub- 
system further comprises an atlas module configured to 
enable creation of a user atlas by selecting at least one 
preferred data resource from a list of data resources from 
which objects may be retrieved. 

6. The system of claim 1, wherein the searching sub- 
system further comprises a query builder module which 
accesses a target profile, wherein the target profile lists at
least one topic from a library of topics, wherein each topic
is associated with one or more of the predefined search
elements and each topic identifies a subject or concept of
interest that is relevant to the information needs of the
community, and wherein the query builder module is
configured to create an electronic master search query by
concatenating the search elements associated with each topic 
listed in the target profile. 

7. The system of claim 6, wherein a master search module
is configured to use the electronic master search query to 
search at least one electronic object within at least one 
database listed in a user atlas. 

8. The system of claim 7, wherein the master search 
module is scheduled to automatically search for electronic
objects at time intervals. 

9. The system of claim 1, further comprising a retrieval 
subsystem comprising a retrieval module configured to 
select the corresponding one of the format filter modules for 
each object identified by the searching subsystem and
deliver each object to the user in a viewing format. 

10. The system of claim 1, further comprising an indexing 
subsystem comprising an indexing module configured to 
create an index of each object identified by the searching 
subsystem by compiling and storing in computer readable
medium summary information that identifies the object; 

whereby the system can quickly search the index of the 
indexed object. 

11. The system of claim 1, further comprising a pattern 
analysis subsystem comprising a pattern analysis module
configured to parse through the objects identified by the 
searching subsystem, and recognize and count words within 
each object that are in the lexicon. 

12. The system of claim 1, further comprising a pattern 
analysis subsystem configured to locate additional terms
within the identified objects according to frequency and 
location of the terms in relation to words within each object 
that are in the lexicon. 

13. The system of claim 1, further comprising a pattern 
analysis subsystem configured to record a number of times
that each object has been retrieved by the system. 

14. A system for search and retrieval of electronic objects, 
the objects including electronically encoded information, the 
system comprising: 

a searching subsystem comprising
one or more electronic lexicons in a memory within the
system, wherein each lexicon is configured to provide
predefined search logic elements designed to
identify objects relevant to a specific community and
topic;

a format filter subsystem coupled to the lexicon
comprising a plurality of format filter modules operable
with the lexicon and configured to identify a format
of an electronic object and to select a corresponding
one of the format filter modules that will enable the
system to search the object using the search elements
within the lexicon; and
a profile management subsystem coupled to the lexicon
comprising a community module, a profile module,
and an atlas module, wherein the community module
is configured to enable selection of a community
lexicon, wherein each community lexicon includes a
library of topics and search elements, wherein the
profile module is configured to enable creation of a
topic profile by selecting at least one topic from a
library of topics, wherein each topic identifies a
subject that is relevant to the information needs of
the community, and wherein the atlas module is
configured to enable creation of a user atlas by 
indicating at least one preferred data resource from a 
list of data resources from which objects may be 
retrieved; 

whereby potential sources of information can be easily 
searched by selecting relevant topics from a community 
lexicon and relevant information in many formats can be 
retrieved for a user. 

15. The system of claim 14, further comprising an index- 
ing subsystem comprising an indexing module configured to 
create an index of each object identified by the searching 
subsystem by compiling and storing in computer readable 
medium summary information that identifies each object 
located by the searching subsystem; 

whereby the system can quickly search the index of the 
indexed object. 

16. The system of claim 14, further comprising a pattern 
analysis subsystem comprising a pattern analysis module 
configured to sift through the objects identified by the 
searching subsystem, and recognize and count words within 
each object that are in the lexicon. 

17. The system of claim 14, further comprising a pattern 
analysis subsystem configured to locate additional terms 
within the identified objects according to frequency and 
location of the terms in relation to words within each object 
that are in the lexicon. 

18. The system of claim 14, further comprising a pattern 
analysis subsystem configured to record a number of times 
that each object has been retrieved by the system. 

19. A method for search and retrieval of electronic 
objects, the objects including electronically encoded 
information, the method comprising: 

identifying a format of an object to be searched; 
selecting a format filter module that is configured to 

enable searching of the object; and 
searching the object using predefined search elements 
found in an electronic lexicon stored in a memory, 
wherein each lexicon is configured to provide the 
predefined search elements designed to identify objects 
relevant to a specific community and topic; 
managing a profile comprising a community module, a
profile module, and an atlas module, wherein the com- 
munity module is configured to enable selection of a 
community lexicon, wherein each community lexicon 
includes a library of topics and search elements, 
wherein the profile module is configured to enable 
creation of a topic profile by selecting at least one topic 
from a library of topics, wherein each topic identifies a 
subject that is relevant to the information needs of the 
community and wherein the atlas module is configured 
to enable creation of a user atlas by indicating at least 
one preferred data resource from a list of data resources 
from which objects may be retrieved; 
whereby potential sources of information can be easily 
searched and relevant information can be retrieved for a 
user. 

20. The method of claim 19, further comprising retrieving 
the object identified in the searching step by using the 
selected format filter module to present the object to the user 
in a viewing format. 

21. The method of claim 19, the method further comprising
selecting a community lexicon, wherein each community
lexicon includes a library of topics and corresponding
search elements, wherein each topic within the library of
topics is associated with one or more of the predefined 
INFORMATION RETRIEVAL APPARATUS AND A METHOD

FIELD OF THE INVENTION

The present invention relates to an information retrieval apparatus and a method to present high quality information to a user quickly in a network system.

BACKGROUND OF THE INVENTION

Recently, the World Wide Web ("WWW") is widely used. In the Internet or Intranet, electric information is easily accessible. Therefore, a so-called information flood occurs and the user can not easily extract his desired data from vast information. In order to reduce this problem, retrieval systems (i.e., search engines) appeared in the Internet or Intranet. As a retrieval system, "Altavista" of DEC and "excite" of Excite corporation are well known. When a retrieval request is received from the user as a retrieval expression, the retrieval system presents the information (Web pages) which best related with the retrieval expression. In this case, a web page as a retrieval object is preserved as a format of an index table and quickly retrieved. However, if a large number of retrieval requests are received at once from many users in the world, it takes a long time to process all of them. Accordingly, in order to quickly respond to the user for at least one retrieval request, the system caches the previous retrieval result. In this technique, if the information is retrieved from the database by the retrieval expression, the information is stored in a storage device (cache memory) as the retrieval result. Hereafter, when the retrieval request is the same as before, at least one retrieval result is extracted from the storage device and quickly presented to the user. In short, after the second retrieval request, the database is not actually retrieved and the information is provided to the user by referring to the cache content. This technique is disclosed in Japanese Patent Disclosure (Kokai) H4-199468, H4-326163.

However, in this kind of the retrieval system the quality of the retrieval result is not taken into consideration because a fast retrieval and a high quality of the retrieval result conflict with each other. In short, when an accurate retrieval algorithm improves the quality of the retrieval result but it takes a long time. On the other hand, in case of the retrieval is executed by a profile (interest topic) by unit of the user, the retrieval results stored in the cache memory is not [...]. In the latter case, response time to present the retrieval result to the user is long because the information is normally retrieved from the database.

Now, characteristics of WWW retrieval in their present condition will be analysed.

Firstly, a huge number of requests come randomly. A large number of the retrieval requests are received through a network, but it is necessary to quickly respond to these retrieval request. Each user respectively locates a terminal [...] each retrieval request reaches the retrieval system at random.

Secondly, same retrieval expressions are requested often. A large number of retrieval requests are not different each other. About seventy percent of the retrieval expressions are same. Especially in the WWW many users want to know [...]. In general, the kind of [...] information is limited and the retrieval requests often are the same retrieval expression.

Thirdly, the retrieval with a sophisticated algorithm takes a long time. The retrieval results will be better when the system retrieves the data with synonyms of words in retrieval expression, the integration of character base and word base retrieval, and with the user's profile, but the retrieval time will be long.

As mentioned-above, when using the elaborate retrieval algorithm, the retrieval takes a long time. However, if the retrieval results formerly obtained is provided, the retrieval results are poor to the user.

SUMMARY OF THE INVENTION

It is an object of the present invention to provide an information retrieval apparatus and a method to present at least one retrieval result of [...] to the user [...].

According to the present invention, there is provided an information retrieval apparatus, comprising: a receiving means for receiving a retrieval request from a user; a first retrieval means for retrieving information from a database in response to the retrieval request; a retrieval result memory means for storing the information retrieved by said first retrieval means; and a second retrieval means for retrieving detailed or additional information from the database by using retrieval request in case a new retrieval request is not received by said receiving means.

Further in accordance with the present invention, there is also provided a method for retrieving information from a database; comprising the steps of: receiving a retrieval request from a user; retrieving the information from the database in response to the retrieval request; storing the information retrieved at the retrieving step in a retrieval result memory; and advanced retrieving information from database by using the retrieval request in case new retrieval request is not received at the receiving step.

Further in accordance with the present invention, there is also provided a computer readable memory containing computer readable instructions to retrieve information from a database, comprising: an instruction means for causing a computer to receive a retrieval request from a user; an instruction means for causing a computer to retrieve the information from the database in response to the retrieval request; an instruction means for causing a computer to store retrieved information in a retrieval result memory; and an instruction means for causing a computer to retrieve information from the database by using the retrieval request in case new retrieval request is not received.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram of a retrieval system incorporating an information retrieval apparatus of the present invention.

FIG. 2 is a block diagram of the information retrieval apparatus according to the first embodiment of the present invention.

FIGS. 3A and 3B show data stored in a retrieval result memory section in the information retrieval apparatus in FIG. 2.

FIG. 4 is a flow chart of the processing of the information retrieval method according to the first embodiment of the present invention.

FIG. 5 is a schematic diagram of a time chart of a second retrieval algorithm according to a second embodiment of the present invention.

FIG. 6 is a block diagram of the information retrieval apparatus according to a second embodiment of the present invention.

FIG. 7 is a flow chart of the processing of the information retrieval method according to the second embodiment of the present invention.
FIG. 8 is a block diagram of the information retrieval apparatus according to a third embodiment of the present invention.

FIG. 9 shows interested topic data stored in a profile information memory section in the information retrieval apparatus in FIG. 8.

FIG. 10 shows number of retrieval request for each retrieval expression stored in the profile information memory section in the information retrieval apparatus in FIG. 8.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

Embodiments of the present invention will be explained referring to the Figures. FIG. 1 is a block diagram of system incorporating the information retrieval method of the present invention. In FIG. 1 the user inputs the retrieval request from a user terminal 1 to a server 3 which is realized as the information retrieval apparatus of the present invention, through a network 2. The server 3 obtains the retrieval result which coincides with the retrieval request by referring a large number of data (for example, a table information of Web page) stored in a database 4, and transmits at least one retrieval result to the user terminal 1. In general, as the data stored in the database 4, the Web page is converted to a format of an index table. However, the conversion is not limited to this format. In short, the retrieval object data may be stored by some format.

FIG. 2 is a block diagram of the information retrieval apparatus according to the first embodiment of the present invention. In FIG. 2, when a retrieval request receiving section 21 receives a retrieval request including a predetermined retrieval expression through the network 2, the retrieval request receiving section 21 sends the retrieval request to a retrieval kind determination section 22. The retrieval kind determination section 22 executes later-mentioned processing in case the retrieval request is not received by the retrieval request receiving section 21. In case the retrieval request is received, the retrieval kind determination section 22 extracts the retrieval expression from the retrieval request and inquires of a request result search section 23 whether a retrieval by same retrieval expression [...]. The retrieval result search section 23 examines whether the retrieval result obtained by the same retrieval expression is already stored in a retrieval result memory section 24.

FIGS. 3A and 3B show examples of data stored in the retrieval result memory section 24. As shown in FIG. 3A, by referring to a field 31 of the retrieval expression, the retrieval result search section 23 examines whether the same retrieval expression [...]. If the same retrieval expression is stored in the field 31, the retrieval result corresponding to the same retrieval expression is extracted from a field 34 of the retrieval result and sent to a retrieval result reply section 25. On the other hand, if the same retrieval expression is not stored in the field 31, an instruction is sent to the retrieval kind determination section 22 to execute the later-mentioned processing. The retrieval result reply section 25 transmits at least one retrieval result received from the retrieval result search section 23 to the user terminal 1.

In case the retrieval type determination section 22 receives the instruction of non-storing of the retrieval expression from the retrieval result search section 23, the retrieval type determination section 22 requests a retrieval based on the retrieval expression to a first retrieval section 26. The first retrieval section 26 retrieves the information from the database 28 by a fast retrieval algorithm, for example, an algorithm of low retrieval accuracy. The first retrieval section 26 writes the retrieval result in the retrieval result memory section 24 and sends the retrieval result to the retrieval result reply section 25. In this case, when the retrieval result is written in the retrieval result memory section 24, a flag is written in a check field 32 of the first retrieval as shown in FIG. 3A. As an internal expression of the system, for example, the flag "1" is written as the retrieval completion. In FIG. 3A, the flag "done" is shown as retrieval completion. In the first embodiment, the first retrieval obtained by using the retrieval expression represents the retrieval based on the first retrieval algorithm (method). In case some data exists in the field 31 of the retrieval expression, the flag is also written in the check field 32 of the first retrieval corresponding to the field 31. Therefore, the check field 32 may be omitted by referring to the retrieval expression field 31 and the retrieval result field 34.

In case the retrieval type determination section 22 does not receive the retrieval request [...], the retrieval type determination section 22 inquires of the retrieval result search section 23 whether the retrieval expression, whose retrieval result is obtained by the first retrieval algorithm and not obtained by the second retrieval algorithm, is stored in the retrieval result memory section 24. In response to the inquiry, the retrieval result search section 23 extracts the retrieval expression by referring to the retrieval result memory section 24 and transmits the retrieval expression to the retrieval type determination section 22. In this case, as shown in FIG. 3A, the retrieval expression, of which the check field 32 of the first algorithm is written by "done" and the check field 33 of the second algorithm is not written, is extracted. If a plurality of the retrieval expressions are extracted, one retrieval expression is selected by a predetermined standard. As the predetermined standard, for example, the latest retrieval expression is selected. Otherwise, as shown in FIG. 3B, the number of retrieval requests by the retrieval expression are counted in a field 35 and the retrieval expression whose number of retrieval requests is largest is selected. In FIG. 3B, the retrieval expression "(BB and CC) or (EE and FF)" is selected because the number of requests is the largest value in the retrieval expressions not having the second retrieval flag "done".

When the retrieval type determination section 22 receives the retrieval expression from the retrieval result search section 23, the retrieval type determination section 22 requests a second retrieval section 27 to retrieve a further information by using the retrieval expression. The second retrieval section 27 retrieves the detail information from a database 28 by the second retrieval algorithm whose retrieval accuracy is higher than the first retrieval algorithm and writes the detail information in the retrieval result memory section 24. In this case, as shown in FIG. 3A, a flag "done" is written in the check field 33.

FIG. 4 is a flow chart of the processing of the information retrieval method according to the first embodiment. The retrieval type determination section 22 decides whether the retrieval request is received [...]. In case of receiving the retrieval request, the retrieval type determination section 22 sends the instruction of receiving to the retrieval result search section 23. The retrieval result search section 23 retrieves the retrieval result memory section 24 (step 42) and decides whether the retrieval result corresponding to the retrieval expression in the retrieval request is stored in the
retrieval result memory section 24 (step 43). When the retrieval result is already stored, the retrieval result search section 23 extracts the retrieval result from the retrieval result memory section 24 (step 44) and sends it to the retrieval result reply section 25. Otherwise, the retrieval result search section 23 sends the instruction to the retrieval type determination section 22. The retrieval type determination section 22 requests the first retrieval section 26 to retrieve by the first algorithm. The first retrieval section 26 retrieves the database 28 by the first algorithm (step 45), writes the retrieval result in the retrieval result memory section 24, and sends it to the retrieval result reply section 25 (step 46). When the retrieval result reply section 25 receives the retrieval result from the retrieval result search section 23 or the first retrieval section 26, the retrieval result is sent to the user (step 47).

On the other hand, when the retrieval type determination section 22 does not receive the retrieval request, the instruction of non-receiving is sent to the retrieval result search section 23. The retrieval result search section 23 searches the retrieval expression, whose retrieval result is obtained by the first algorithm but not obtained by the second algorithm, from the retrieval result memory section 24 (step 48). If the retrieval expression is searched (step 49), one retrieval expression is selected by the predetermined standard (step 410) and outputted to the retrieval type determination section 22. The retrieval type determination section 22 requests the second retrieval section 27 to retrieve by the second algorithm. The second retrieval section 27 retrieves more appropriate information from the database by the second algorithm (step 411), and writes the retrieval result to the retrieval result memory section 24. The flag "done" is written in the check field 33 in the retrieval result memory section 24.

For example, the first retrieval algorithm retrieves information for the past one year from the database and the second retrieval algorithm retrieves information for the past ten years. As another example, the first retrieval algorithm retrieves using a keyword (retrieval expression) only and the second retrieval algorithm retrieves by using not only the keyword but also a synonym or a same category word.

As mentioned-above, if the retrieval request is received and a system load becomes high, the first retrieval algorithm whose accuracy is low is quickly executed. In this case, the response time to the user is short. On the other hand, while another retrieval request is not received, the second retrieval algorithm whose accuracy is high is executed. In this case, the detailed information for the retrieval request received in the near future is previously stored. As a result, in response to the retrieval request in the future, the retrieval result of high quality is quickly presented to the user.

In the first embodiment, while the retrieval request is not received, the second retrieval algorithm is executed. However, in the second embodiment, by monitoring the load of the system in addition to non-receiving of the retrieval request, the second retrieval algorithm is decided to be executed. FIG. 5 is a time chart showing a change state of the system load. As shown in FIG. 5, while the system load is low, the second retrieval algorithm is executed.

FIG. 6 is block diagram of the information retrieval apparatus according to the second embodiment. In comparison with the first embodiment in FIG. 2, a load calculation section 61 differs in the block diagram. In case the retrieval request is not received, the retrieval type determination section 22 asks the load calculation section 61 for the load status of the system. The load calculation section 61 calculates the load value of the system at the present time, and sends the load value to the retrieval type determination section 22. The load value is calculated by using a system call of the operating system or a load average of the CPU. The load value may be the average value at predetermined interval. When the retrieval type determination section 22 receives the load value, the retrieval type determination section 22 decides whether the load value is below a predetermined threshold. If the load value is below the predetermined threshold (the system load is small), the retrieval type determination section 22 asks the retrieval result search section 23 to search the retrieval expression whose retrieval result is obtained by the first algorithm but not obtained by the second algorithm from the retrieval result memory section 24. Hereafter, processing is same as the first embodiment. Therefore, explanation is omitted.

FIG. 7 is a flow chart of processing of the information retrieval method according to the second embodiment. In comparison with the flow chart of the first embodiment in FIG. 4, steps 78 and 79 are different steps in the flow chart. In FIG. 7, in case the retrieval request is not received (step 71), the retrieval type determination section 22 asks the load calculation section 61 for the load value. The load calculation section 61 calculates the load value (step 78). The retrieval type determination section 22 compares the load value with the predetermined threshold. If the load value is below the predetermined threshold (step 79), processing from step 710 is continuously executed.

As mentioned-above, in the second embodiment, the actual load of the system is monitored while the retrieval request is not received. Accordingly, if the load of the system is high for some reason, the processing to highly raise the load (second retrieval algorithm) is avoided. As a result, the system is effectively used. In case the predetermined server 3 in FIG. 1 is utilized as not only the retrieval system but also various kinds of processing systems, the second embodiment is especially useful.

In the first and second embodiments, while another retrieval request is not received and so on, the second retrieval algorithm is executed for the retrieval expression whose retrieval result is not obtained by the first retrieval algorithm. However, in the third embodiment, the retrieval algorithm is executed according to a user's attribute profile [...] retrieval expression. For example, assume that the first retrieval algorithm and second retrieval algorithm are already executed using a keyword "personal computer" in the retrieval request. In this case, if many users [...], the detail information is previously [...] the keyword "Personal Computer" [...].

FIG. 8 is a block diagram of the information retrieval apparatus according to the third embodiment. In comparison with the first embodiment in FIG. 2, a profile information update section 81 and a profile information memory section 82 are different. First, when the retrieval request [...] the retrieval request receiving section 21 [...] the profile information update section 81 [...] information memory section 82, the profile information update section 81 secures an area for the user A in the profile information memory section 82. The first retrieval section 26 or the second retrieval section 27 extracts the predetermined element (for example, a keyword) from the retrieval expression S, and sends the keyword to the profile information update section 81. The profile information update section 81 writes the keyword in the area of the user A of the profile information memory section 82. In this case, the profile information memory section 82 may store the number of appearance frequency by unit of the keyword. In short, the retrieval corresponding to a different retrieval request by unit of the user may be executed.
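
The per-user keyword bookkeeping of the third embodiment can be sketched as follows. This is only an illustration under stated assumptions, not the implementation of the invention: the class name `ProfileMemory`, the helper `extract_keywords`, and the rule that `and`, `or` and `not` are operators rather than keywords are all hypothetical choices made for this sketch.

```python
import re
from collections import Counter, defaultdict

# Assumption: these words are operators of the retrieval expression, not keywords.
OPERATORS = {"and", "or", "not"}


def extract_keywords(expression):
    """Split a retrieval expression into lower-cased keywords, dropping operators."""
    tokens = re.findall(r"[A-Za-z0-9]+", expression)
    return [t.lower() for t in tokens if t.lower() not in OPERATORS]


class ProfileMemory:
    """Hypothetical stand-in for the profile information memory section 82."""

    def __init__(self):
        # One area per user; each area maps keyword -> appearance frequency.
        self._areas = defaultdict(Counter)

    def update(self, user_id, expression):
        """Plays the role of the profile information update section 81."""
        keywords = extract_keywords(expression)
        self._areas[user_id].update(keywords)
        return keywords

    def frequency(self, user_id, keyword):
        return self._areas[user_id][keyword]

    def top_keywords(self, user_id, k=3):
        """Most frequent keywords of one user, highest frequency first."""
        return [kw for kw, _ in self._areas[user_id].most_common(k)]
```

An area for a user is created on first use, which corresponds to securing an area for the user A when that user is not yet registered.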

Next, assume that the first retrieval section 26 retrieves
data from the database, stores the n highest-ranked first
retrieval results in the retrieval result memory section 24,
and sends the m highest-ranked results (m<n) of those n
results to the user A. Then, if the second retrieval
section 27 executes the second retrieval algorithm by using
the retrieval expression S in the same way as in the first
embodiment, the n highest-ranked second retrieval results
are also stored in the retrieval result memory section 24.
In this case, however, the n first retrieval results are not
deleted from the retrieval result memory section 24. The
reason is that, when the user A wants to view the first
retrieval results from the (m+1)th rank onward, the second
retrieval results from the (m+1)th rank onward do not
correspond to the first retrieval results already presented.
In order to avoid this situation, in the case of the
"WWW", the user is identified by using a "cookie" or a "fat
URL", and the first retrieval results from the (m+1)th rank
onward are presented to the user A. The storing time of the
first retrieval results in the retrieval result memory
section 24 is determined in advance. When the storing time
has passed, the first retrieval results are deleted from the
retrieval result memory section 24. Accordingly, if
another user inputs a retrieval request including the
retrieval expression S, the second retrieval results are
presented to that user.
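
The retention of the first retrieval results for a user who is still paging through them might be sketched as below. The class `ResultMemory`, its method names, the injected `clock`, and the `user_id` string (standing in for whatever the cookie or fat URL identifies) are assumptions of this sketch. It also assumes that a user who has not yet seen any first retrieval results receives the second retrieval results as soon as they exist, which the text does not state explicitly.

```python
import time


class ResultMemory:
    """Hypothetical sketch of the retrieval result memory section 24."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds        # predetermined storing time
        self.clock = clock
        self.first = {}               # expression -> (results, stored_at)
        self.second = {}              # expression -> results
        self.viewing_first = set()    # (user_id, expression) pairs

    def store_first(self, expr, results):
        self.first[expr] = (list(results), self.clock())

    def store_second(self, expr, results):
        # The first retrieval results are deliberately kept.
        self.second[expr] = list(results)

    def _expire(self, expr):
        entry = self.first.get(expr)
        if entry is not None and self.clock() - entry[1] >= self.ttl:
            del self.first[expr]
            self.viewing_first = {k for k in self.viewing_first if k[1] != expr}

    def page(self, user_id, expr, start, count):
        """Return results start..start+count-1 for this user and expression."""
        self._expire(expr)
        key = (user_id, expr)
        use_first = expr in self.first and (
            key in self.viewing_first or expr not in self.second
        )
        if use_first:
            self.viewing_first.add(key)
            source = self.first[expr][0]
        else:
            source = self.second.get(expr, [])
        return source[start:start + count]
```

With this arrangement the user A continues from rank m+1 of the same list whose first m entries were already shown, while the memory falls back to the second retrieval results once the storing time has passed.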

Next, a modification of the third embodiment is
explained. FIG. 9 shows a table of the interested topic for
each user stored in the profile information memory section 82.
First, each user registers his interested topic in advance in
the profile information memory section 82 through the user
terminal 1. Hereafter, whenever a user inputs a
retrieval request including a retrieval expression, the
profile information update section 81 counts the number of
retrieval requests by unit of the retrieval expression for each
interested topic. FIG. 10 shows a table of the counted number
of retrieval requests by unit of the retrieval expression for
each interested topic stored in the profile information memory
section 82. In this way, when the retrieval request is not
received, the second retrieval section 27 retrieves the detail
information by using the retrieval expression with the largest
number of retrieval requests. In FIG. 10, the second
retrieval algorithm is executed by using the keywords
"SOCCER" and "WORLD CUP". Accordingly, the detail
information in which many users are interested at the present
time is stored in advance, and is quickly presented to a user
when he inputs a retrieval request including this retrieval
expression.
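
The counting of retrieval requests per interested topic, and the choice of the expression to retrieve in detail while no request is pending, can be sketched as follows. The class `TopicCounter`, its method names, and the sample topic "ECONOMY" with expression "STOCK" are hypothetical; only "SOCCER" and "WORLD CUP" come from the description of FIG. 10. The sketch assumes one registered topic per user.

```python
from collections import Counter


class TopicCounter:
    """Hypothetical sketch of the tables of FIG. 9 and FIG. 10."""

    def __init__(self):
        self.topic_by_user = {}   # user id -> registered interested topic
        self.counts = Counter()   # (topic, expression) -> number of requests

    def register(self, user_id, topic):
        self.topic_by_user[user_id] = topic

    def record_request(self, user_id, expression):
        """Count one retrieval request under the user's registered topic."""
        topic = self.topic_by_user.get(user_id)
        if topic is not None:
            self.counts[(topic, expression)] += 1

    def next_for_second_retrieval(self, already_done=()):
        """Pick the (topic, expression) pair with the most requests so far."""
        candidates = [
            (key, n) for key, n in self.counts.items() if key not in already_done
        ]
        if not candidates:
            return None
        return max(candidates, key=lambda item: item[1])[0]
```

Passing the pairs that already have detail information as `already_done` lets repeated idle periods work down the list in order of popularity.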

In case the quantity of information (for example,
documents) as retrieval objects increases, the content of the
database 28 is updated in proportion to the increased
quantity of the information. In this case, the updated database 28
often affects the retrieval results stored in the retrieval result
memory section 24. For example, in case the retrieval results
by a keyword "Personal Computer" are already stored in the
retrieval result memory section 24 and a new document
including the word "Personal Computer" is added to the
database 28, the retrieval results by the keyword "Personal
Computer" must be deleted from the retrieval result memory
section 24. Therefore, in the fourth embodiment, if a new
document is added to the database, it is determined whether
a word included in the new document coincides with a retrieval
expression in the retrieval result memory section 24. If the
word coincides with the retrieval expression, the retrieval
results are deleted from the retrieval result memory section 24.
Furthermore, if a document is deleted from the database,
it is determined whether retrieval results including at least
one part of the document are stored in the retrieval result
memory section 24. If such retrieval results are already
stored, they are deleted from the retrieval result memory
section 24. As for the deleted retrieval results, it is possible
for the user to retrieve again. Therefore, while the load of the
retrieval engine is low, retrieval for the updated database is
executed again by using the same retrieval expression. In this
case, if the retrieval results corresponding to a plurality of
retrieval expressions are deleted, the retrieval expression is
selected in priority order to retrieve the updated database. For
example, the retrieval expression of the largest number of
retrieval requests, or the retrieval expression of the most
recent use is selected.
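
The deletion of stale retrieval results and the priority order for retrieving again can be sketched as below. The class `ResultCache` and its method names are hypothetical. The sketch assumes keyword-only retrieval expressions and treats an expression as coinciding with a new document when every word of the expression occurs in the document, which is one possible reading of the coincidence test. Priority here is the largest number of retrieval requests, one of the two examples given.

```python
import re
from collections import Counter


def words_of(text):
    """Lower-cased set of alphanumeric words in a text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


class ResultCache:
    """Hypothetical sketch of invalidation in the retrieval result memory."""

    def __init__(self):
        self.results = {}          # expression -> list of document ids
        self.requests = Counter()  # expression -> number of retrieval requests
        self.pending = set()       # expressions whose results were deleted

    def store(self, expr, doc_ids):
        self.results[expr] = list(doc_ids)
        self.pending.discard(expr)

    def _invalidate(self, expr):
        del self.results[expr]
        self.pending.add(expr)

    def on_document_added(self, text):
        new_words = words_of(text)
        for expr in list(self.results):
            if words_of(expr) <= new_words:  # all expression words appear
                self._invalidate(expr)

    def on_document_deleted(self, doc_id):
        for expr in list(self.results):
            if doc_id in self.results[expr]:
                self._invalidate(expr)

    def next_to_refresh(self):
        """Deleted expression with the largest number of requests, if any."""
        if not self.pending:
            return None
        return max(self.pending, key=lambda e: self.requests[e])
```

A caller would invoke `next_to_refresh` only while the load is low, run the retrieval again, and call `store` with the new results.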

In the first, second, and third embodiments, the first
retrieval algorithm and the second retrieval algorithm are
used. However, a plurality of different kinds of retrieval
algorithms, i.e., more than two, may be used. For example,
after the second retrieval algorithm is executed, a third
retrieval algorithm whose accuracy is higher than the second
retrieval algorithm may be executed during idle time of the
computer. By repeating this processing, an arbitrary number
of retrieval algorithms of successively higher accuracy can be
used.
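
A cascade of retrieval algorithms of increasing accuracy, advanced one step at a time while the load is low, might look like the sketch below. Everything named here is an assumption made for illustration: the class `TieredRetriever`, the injected `load_fn` and `threshold`, the toy corpus `DOCS`, and the `SYNONYMS` and `CATEGORY` tables. The three toy algorithms loosely follow the keyword, synonym and same-category example of the description.

```python
DOCS = ["pc sale", "computer repair", "laptop review"]   # toy corpus
SYNONYMS = {"pc": {"computer"}}                          # hypothetical table
CATEGORY = {"pc": {"laptop"}}                            # hypothetical table


def exact(expr):
    return [d for d in DOCS if expr in d.split()]


def with_synonyms(expr):
    terms = {expr} | SYNONYMS.get(expr, set())
    return [d for d in DOCS if terms & set(d.split())]


def with_category(expr):
    terms = {expr} | SYNONYMS.get(expr, set()) | CATEGORY.get(expr, set())
    return [d for d in DOCS if terms & set(d.split())]


class TieredRetriever:
    """Answer quickly with tier 0, then upgrade stored results when idle."""

    def __init__(self, algorithms, load_fn, threshold):
        self.algorithms = algorithms   # ordered from low to high accuracy
        self.load_fn = load_fn         # returns the current load value
        self.threshold = threshold
        self.memory = {}               # expression -> (tier index, results)

    def handle_request(self, expr):
        if expr not in self.memory:
            self.memory[expr] = (0, self.algorithms[0](expr))
        return self.memory[expr][1]

    def idle_step(self):
        """Upgrade one expression by one tier; return it, or None if nothing ran."""
        if self.load_fn() >= self.threshold:
            return None
        for expr, (tier, _) in self.memory.items():
            if tier + 1 < len(self.algorithms):
                self.memory[expr] = (tier + 1, self.algorithms[tier + 1](expr))
                return expr
        return None
```

On Unix-like systems `os.getloadavg()` is one possible source for `load_fn`; the sketch takes the function as a parameter so that the policy stays independent of how the load is measured.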

Furthermore, in the above-mentioned embodiments, the
retrieval is executed for text data in the database, as an
example. However, the retrieval may be executed for
multimedia data in the database. For example, as an image
retrieval algorithm of low accuracy, matching at a histogram
level is executed by using a reduced image or a low
resolution image. As an image retrieval algorithm of high
accuracy, matching at a high level is executed by using a
high resolution image.
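
The two image matching levels can be illustrated with a small sketch on grayscale images given as lists of pixel rows. The function names, the use of histogram intersection for the coarse match, and the use of mean absolute pixel difference for the fine match are assumptions of this sketch; the description only states that the coarse match works at a histogram level on a reduced image and that the fine match uses a high resolution image.

```python
def downsample(image, factor):
    """Reduced image: keep every factor-th row and column."""
    return [row[::factor] for row in image[::factor]]


def histogram(pixels, bins=4, max_value=256):
    """Normalized intensity histogram of a flat list of pixel values."""
    counts = [0] * bins
    for p in pixels:
        counts[p * bins // max_value] += 1
    total = len(pixels)
    return [c / total for c in counts]


def coarse_score(a, b, factor=2):
    """Low-accuracy match: histogram intersection on reduced images."""
    ha = histogram([p for row in downsample(a, factor) for p in row])
    hb = histogram([p for row in downsample(b, factor) for p in row])
    return sum(min(x, y) for x, y in zip(ha, hb))


def fine_score(a, b):
    """Higher-accuracy match: pixel-wise similarity at full resolution.

    Assumes both images have the same dimensions.
    """
    diffs = [abs(pa - pb) for ra, rb in zip(a, b) for pa, pb in zip(ra, rb)]
    return 1 - sum(diffs) / (255 * len(diffs))
```

A mirrored image has the same histogram as the original, so the coarse score cannot tell the two apart while the fine score can, which is the kind of quality gap the slower algorithm is meant to close.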

A memory can be used to store instructions for perform- 
ing the process of the present invention described above. 
Such a memory can be a hard disk, optical disk, semicon- 
ductor memory, and so on. 

Other embodiments of the invention will be apparent to 
those skilled in the art from consideration of the specifica- 
tion and practice of the invention disclosed herein. It is 
intended that the specification and examples be considered 
as exemplary only, with the true scope and spirit of the 
invention being indicated by the following claims. 

What is claimed is: 

1. An information retrieval apparatus, comprising: 

a receiving unit configured to receive from a user, 
retrieval requests including a first retrieval request; 

a first retrieval unit configured to retrieve first information 
from a database in response to the first retrieval request, 
in accordance with a first retrieval algorithm that has a 
first retrieval accuracy; 

a retrieval result memory configured to store the first 
retrieval request and the first information; 

a retrieval result reply unit configured to send the first 
information to the user; and 

a second retrieval unit configured to use the stored first 
retrieval request to retrieve from the database, while a 
load of the apparatus is below a threshold, high quality 
information that has a higher quality than the first 
information, in accordance with a second retrieval 
algorithm that has a second retrieval accuracy that is 
higher than the first retrieval accuracy; 
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wherein said retrieval result memory stores the high the highest number of retrieval requests in a plurality of 

quality information retrieved by said second retrieval retrieval expressions if high quality information corre- 

unit; and sponding to the plurality of retrieval expressions is not 

wherein, if said receiving unit receives a second retrieval stored in said retrieval result memory. 

request that is identical to or similar to the stored first s 12. A method for retrieving information from a database, 

retrieval request, said retrieval result reply unit sends comprising: 

the high quality information stored in said retrieval receiving a first retrieval request from a user 

result memory to a user of the second retrieval request. first-retrieving first information from the database in 

wherei? 6 mf0rmatl ° Q "'"^ a PP aratus of cla ™ w response to the first retrieval request, according to a 

first retrieval algorithm that has a first retrieval accu- 

said second retrieval unit retrieves the high quality infor- racy; 

mation from the database while said first retrieval unit storing m a retrieval result m ^ first mfonnation 

does not process the first information in response to the ^ lQe ^ retrieva , . 
first retrieval request. 

3. The information retrieval apparatus of claim 1, further 15 S6adlng ^ flrSt ^formation to the user; 
comprising: while a system load is below a threshold, using the stored 

a load calculation unit that calculates the load of the first retrieval request in a step of second-retrieving from 

apparatus. me database, high quality information that has a higher 

4. The information retrieval apparatus of claim 1 further qUality thaD thc &st inform&ti °n, m accordance with a 
comprising: 20 second retrieval algorithm that has a second retrieval 

, „,„fli. ;„f c , • , accuracy that is higher than the first retrieval accuracy; 

a profile information memory configured to previously , • , . , <*iuavj, 

store an interested topic for each user code and to St ° n ° 8 ln rctricval result memory, the high quality 

count a number of retrieval requests for each retrieval ^formation retrieved in the second-retrieving step; and 

expression in each interested topic. 25 if a second retrieval request is received that is identical to 

or similar to the stored first retrieval request, sending
the high quality information to a user of the second
retrieval request.

5. The information retrieval apparatus of claim 4,
wherein:

said second retrieval unit retrieves the high quality
information from the database by both the interested topic
and the retrieval expression having the highest number
of retrieval requests.

6. The information retrieval apparatus of claim 1,
wherein:

said first retrieval unit retrieves the first information by at
least one retrieval expression in the first retrieval
request, and

said second retrieval unit retrieves the high quality
information by a modified retrieval expression of the at least
one retrieval expression.

7. The information retrieval apparatus of claim 6,
wherein:

said retrieval result memory stores the high quality
information retrieved by said second retrieval unit, for each
retrieval expression.

8. The information retrieval apparatus of claim 7,
wherein:

said second retrieval unit retrieves the high quality
information from the database by the modified retrieval
expression if the high quality information corresponding
to the retrieval expression is not stored in said
retrieval result memory.

9. The information retrieval apparatus of claim 8,
wherein:

said retrieval result reply unit sends the high quality
information to the user in response to a retrieval request
if the high quality information corresponding to the
retrieval expression in the retrieval request is stored in
said retrieval result memory.

10. The information retrieval apparatus of claim 7,
wherein said retrieval result memory additionally stores a
number of retrieval requests with the information for
each retrieval expression.

11. The information retrieval apparatus of claim 10,
wherein said second retrieval unit retrieves the high
quality information from the database by the modified
retrieval expression of one retrieval expression having

13. The method for retrieving information of claim 12,
wherein:

the first information is retrieved at the first-retrieving step
by a retrieval expression in the first retrieval request;
and

the high quality information is retrieved at the
second-retrieving step by a modified retrieval expression of the
retrieval expression.

14. The method for retrieving information of claim 12,
wherein:

the first information is retrieved at the first-retrieving step
by a retrieval expression in the first retrieval request;
and

wherein the high quality information is retrieved at the
second-retrieving step with integrated algorithm of
word-base search and character-base search.

15. The method for retrieving information of claim 12,
wherein:

the first information is retrieved at the first-retrieving step
by a retrieval expression in the first retrieval request;
and

the high quality information is retrieved at the
second-retrieving step from the database including more
information in terms of time from older to newer.

16. The method for retrieving information of claim 12,
wherein:

the high quality information is retrieved at the
second-retrieving step while the first information is not
processed in response to the first retrieval request.

17. The method for retrieving information of claim 12,
further comprising:

storing an interested topic for each user code; and

counting a number of retrieval requests for each retrieval
expression in each interested topic.

18. The method for retrieving information of claim 17,
wherein:

the high quality information is retrieved at the
second-retrieving step by both the interested topic and the
retrieval expression having the highest number of
retrieval requests.
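
By way of illustration only, the bookkeeping recited in claims 17 and 18 might be sketched as follows in Python. The class name `TopicRequestCounter`, its method names, and the in-memory dictionaries are assumptions made for this sketch; the claims do not specify any data structure. The sketch stores one interested topic per user code, counts requests per retrieval expression within each topic, and reports the most requested expression for a topic.

```python
from collections import Counter, defaultdict
from typing import Dict, Optional


class TopicRequestCounter:
    """Hypothetical bookkeeping for interested topics and request counts."""

    def __init__(self) -> None:
        # user code -> interested topic (one topic per user in this sketch)
        self.topic_by_user: Dict[str, str] = {}
        # interested topic -> Counter of retrieval expressions
        self.counts_by_topic: Dict[str, Counter] = defaultdict(Counter)

    def set_topic(self, user_code: str, topic: str) -> None:
        """Store the interested topic for a user code."""
        self.topic_by_user[user_code] = topic

    def record_request(self, user_code: str, expression: str) -> None:
        """Count one retrieval request under the user's interested topic."""
        topic = self.topic_by_user[user_code]
        self.counts_by_topic[topic][expression] += 1

    def most_requested(self, topic: str) -> Optional[str]:
        """Return the expression with the highest count, or None if unseen."""
        counts = self.counts_by_topic.get(topic)
        if not counts:
            return None
        return counts.most_common(1)[0][0]
```

Under these assumptions, the second-retrieving step would call `most_requested(topic)` to decide which retrieval expression to process for a given interested topic.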

19. The method for retrieving information of claim 12, 
further comprising: 

calculating the system load.

20. The method for retrieving information of claim 13, 
wherein: 

the high quality information is stored at the storing step in
the retrieval result memory for each retrieval expression.

21. The method for retrieving information of claim 20, 
wherein: 

the high quality information is retrieved at the second- 
retrieving step by the modified retrieval expression if 
the high quality information corresponding to the 
retrieval expression is not stored in the retrieval result 
memory. 

22. The method for retrieving information of claim 21, 
wherein: 

the high quality information is sent at the sending step to 
the user in response to the retrieval request if the high 
quality information corresponding to the retrieval 
expression in the retrieval request is stored in the 
retrieval result memory.

23. The method for retrieving information of claim 20, 
further comprising: 

storing a number of retrieval requests with the informa- 
tion retrieved at the first-retrieving step for each 
retrieval expression.

24. The method for retrieving information of claim 23, 
wherein: 

the high quality information is retrieved at the second- 
retrieving step by the modified retrieval expression of 
one retrieval expression having the highest number of
retrieval requests in a plurality of retrieval expressions 
if high quality information corresponding to the plu- 
rality of retrieval expressions is not stored in the 
retrieval result memory. 
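
By way of illustration only, the selection recited in claims 23 and 24 might be sketched as follows in Python. The function name `next_expression_to_refine`, the `Counter` of request counts, and the dictionary standing in for the retrieval result memory are assumptions made for this sketch. It reads claim 24 as choosing, among the retrieval expressions that have no stored high quality information, the one with the highest number of retrieval requests; other readings are possible.

```python
from collections import Counter
from typing import Dict, List, Optional


def next_expression_to_refine(
    request_counts: Counter,
    high_quality: Dict[str, List[str]],
) -> Optional[str]:
    """Pick the most requested expression with no stored high quality result.

    request_counts: expression -> number of retrieval requests (claim 23).
    high_quality:   stand-in for the retrieval result memory, keyed by
                    expression (an assumption of this sketch).
    """
    uncached = {
        expression: count
        for expression, count in request_counts.items()
        if expression not in high_quality
    }
    if not uncached:
        return None  # every expression already has a high quality result
    # Ties resolve to the expression seen first; the claims do not say how.
    return max(uncached, key=uncached.get)
```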
25. A computer readable memory containing computer
readable instructions to retrieve information from a 
database, the computer readable memory comprising: 

instruction means for causing a computer to receive a first 
retrieval request from a user; 

instruction means for causing the computer to retrieve 
first information from the database in response to the 
first retrieval request, in accordance with a first retrieval 
algorithm that has a first retrieval accuracy; 

instruction means for causing the computer to store in a 
retrieval result memory, the first information and the 
first retrieval request; 

instruction means for causing the computer to send the 
first information to the user; 

instruction means for causing the computer to use the 
stored first retrieval request to retrieve from the 
database, while a system load is below a threshold, high 
quality information that has a higher quality than the 
first information, according to a second retrieval algo- 
rithm that has a second retrieval accuracy that is higher 
than the first retrieval accuracy; 

instruction means for causing the computer to store in the 
retrieval result memory, the high quality information 
retrieved according to the second retrieval algorithm; 
and 

instruction means for causing the computer, if a second 
retrieval request is received that is identical to or 
similar to the stored first retrieval request, to send the 
high quality information to a user of the second 
retrieval request. 
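
By way of illustration only, the sequence of instruction means in claim 25 might be sketched as follows in Python. Everything named here is an assumption made for the sketch: the class `TwoStageRetriever`, the toy `fast_search` and `accurate_search` functions standing in for the first and second retrieval algorithms, the default threshold of 0.5, and the `normalize` rule, which treats two requests as "identical or similar" when they contain the same words regardless of case and order. The claim itself fixes none of these.

```python
from typing import Dict, List


def normalize(expression: str) -> str:
    """Assumed similarity rule: same words, ignoring case and order."""
    return " ".join(sorted(expression.lower().split()))


def fast_search(database: List[str], expression: str) -> List[str]:
    """Toy first algorithm (lower accuracy): any term matches as a substring."""
    terms = expression.lower().split()
    return [doc for doc in database if any(t in doc.lower() for t in terms)]


def accurate_search(database: List[str], expression: str) -> List[str]:
    """Toy second algorithm (higher accuracy): every term is a whole word."""
    terms = set(expression.lower().split())
    return [doc for doc in database if terms <= set(doc.lower().split())]


class TwoStageRetriever:
    def __init__(self, database: List[str], load_threshold: float = 0.5) -> None:
        self.database = database
        self.load_threshold = load_threshold
        self.pending: List[str] = []                   # stored first requests
        self.first_results: Dict[str, List[str]] = {}  # stored first information
        self.high_quality: Dict[str, List[str]] = {}   # stored high quality info

    def handle_request(self, expression: str) -> List[str]:
        """Reply with stored high quality information if any, else search fast."""
        key = normalize(expression)
        if key in self.high_quality:
            return self.high_quality[key]
        result = fast_search(self.database, expression)
        self.first_results[key] = result
        if key not in self.pending:
            self.pending.append(key)
        return result

    def refine_when_idle(self, system_load: float) -> int:
        """Run the second algorithm on stored requests while load is low."""
        if system_load >= self.load_threshold:
            return 0
        refined = 0
        while self.pending:
            key = self.pending.pop(0)
            self.high_quality[key] = accurate_search(self.database, key)
            refined += 1
        return refined
```

In this sketch a first request is answered immediately by the faster, less accurate search; a later call to `refine_when_idle` with a load below the threshold fills the retrieval result memory, and a subsequent identical or similar request is answered from it.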

26. The computer readable memory of claim 25, further 
comprising: 

instruction means for causing the computer to calculate
the system load. 
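
By way of illustration only, one way a system load could be calculated and compared with a threshold is sketched below in Python. The claims do not state how the load is measured; the use of the operating system's one-minute load average divided by the processor count, the function names `system_load` and `below_threshold`, and the default threshold of 0.5 are all assumptions made for this sketch. `os.getloadavg` is available on Unix-like systems only.

```python
import os
from typing import Optional


def system_load(
    load_average: Optional[float] = None,
    cpu_count: Optional[int] = None,
) -> float:
    """Return load per processor; arguments may be injected for testing."""
    if load_average is None:
        # 1-minute load average (Unix-like systems only).
        load_average = os.getloadavg()[0]
    if cpu_count is None:
        # os.cpu_count() may return None, so fall back to 1.
        cpu_count = os.cpu_count() or 1
    return load_average / cpu_count


def below_threshold(load: float, threshold: float = 0.5) -> bool:
    """True when the second retrieval may run under the assumed threshold."""
    return load < threshold
```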

* * * * *